Dataset statistics
| Number of variables | 31 |
|---|---|
| Number of observations | 96082 |
| Missing cells | 399799 |
| Missing cells (%) | 13.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 22.7 MiB |
| Average record size in memory | 248.0 B |
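The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming missing cells are measured against variables × observations and that 1 MiB is 1024 × 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 31
n_observations = 96082
missing_cells = 399799
total_size_mib = 22.7  # already rounded in the report

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in bytes, assuming 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded to one decimal place, the recomputed record size only approximates the reported 248.0 B.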
Variable types
| Categorical | 23 |
|---|---|
| Numeric | 8 |
CIVILIAN_TARGETING has constant value "" | Constant |
EVENT_ID_CNTY has a high cardinality: 96082 distinct values | High cardinality |
EVENT_DATE has a high cardinality: 1901 distinct values | High cardinality |
ACTOR1 has a high cardinality: 115 distinct values | High cardinality |
ASSOC_ACTOR_1 has a high cardinality: 691 distinct values | High cardinality |
ACTOR2 has a high cardinality: 88 distinct values | High cardinality |
ASSOC_ACTOR_2 has a high cardinality: 378 distinct values | High cardinality |
ADMIN2 has a high cardinality: 150 distinct values | High cardinality |
ADMIN3 has a high cardinality: 776 distinct values | High cardinality |
LOCATION has a high cardinality: 2470 distinct values | High cardinality |
SOURCE has a high cardinality: 4819 distinct values | High cardinality |
NOTES has a high cardinality: 95577 distinct values | High cardinality |
TAGS has a high cardinality: 358 distinct values | High cardinality |
YEAR is highly overall correlated with INTER1 and 2 other fields | High correlation |
INTER1 is highly overall correlated with YEAR and 6 other fields | High correlation |
INTER2 is highly overall correlated with ACTOR2 | High correlation |
INTERACTION is highly overall correlated with YEAR and 7 other fields | High correlation |
LATITUDE is highly overall correlated with ISO and 3 other fields | High correlation |
LONGITUDE is highly overall correlated with INTERACTION and 1 other field | High correlation |
TIMESTAMP is highly overall correlated with YEAR and 2 other fields | High correlation |
DISORDER_TYPE is highly overall correlated with INTER1 and 4 other fields | High correlation |
EVENT_TYPE is highly overall correlated with INTER1 and 4 other fields | High correlation |
SUB_EVENT_TYPE is highly overall correlated with INTER1 and 3 other fields | High correlation |
ACTOR2 is highly overall correlated with INTER1 and 8 other fields | High correlation |
ISO is highly overall correlated with LATITUDE and 4 other fields | High correlation |
REGION is highly overall correlated with LATITUDE and 4 other fields | High correlation |
COUNTRY is highly overall correlated with LATITUDE and 4 other fields | High correlation |
ADMIN1 is highly overall correlated with LATITUDE and 4 other fields | High correlation |
GEO_PRECISION is highly overall correlated with ACTOR2 | High correlation |
TIME_PRECISION is highly imbalanced (94.6%) | Imbalance |
DISORDER_TYPE is highly imbalanced (75.2%) | Imbalance |
SUB_EVENT_TYPE is highly imbalanced (60.7%) | Imbalance |
ACTOR1 is highly imbalanced (63.4%) | Imbalance |
ACTOR2 is highly imbalanced (63.3%) | Imbalance |
ASSOC_ACTOR_2 is highly imbalanced (72.3%) | Imbalance |
ISO is highly imbalanced (99.9%) | Imbalance |
REGION is highly imbalanced (99.9%) | Imbalance |
COUNTRY is highly imbalanced (99.9%) | Imbalance |
ADMIN1 is highly imbalanced (50.7%) | Imbalance |
SOURCE is highly imbalanced (58.6%) | Imbalance |
SOURCE_SCALE is highly imbalanced (64.8%) | Imbalance |
TAGS is highly imbalanced (56.5%) | Imbalance |
ASSOC_ACTOR_1 has 89594 (93.2%) missing values | Missing |
ACTOR2 has 44253 (46.1%) missing values | Missing |
ASSOC_ACTOR_2 has 81404 (84.7%) missing values | Missing |
CIVILIAN_TARGETING has 91894 (95.6%) missing values | Missing |
ADMIN3 has 2402 (2.5%) missing values | Missing |
TAGS has 90144 (93.8%) missing values | Missing |
FATALITIES is highly skewed (γ1 = 43.39440215) | Skewed |
EVENT_ID_CNTY is uniformly distributed | Uniform |
NOTES is uniformly distributed | Uniform |
EVENT_ID_CNTY has unique values | Unique |
INTER2 has 44253 (46.1%) zeros | Zeros |
FATALITIES has 91308 (95.0%) zeros | Zeros |
Reproduction
| Analysis started | 2023-03-29 12:44:50.517105 |
|---|---|
| Analysis finished | 2023-03-29 12:45:15.929977 |
| Duration | 25.41 seconds |
| Software version | ydata-profiling v4.1.2 |
| Download configuration | config.json |
EVENT_ID_CNTY
Categorical
HIGH CARDINALITY  UNIFORM  UNIQUE 
| Distinct | 96082 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| ROU448 | 1 |
|---|---|
| UKR63701 | 1 |
| UKR63655 | 1 |
| UKR63686 | 1 |
| UKR63648 | 1 |
| Other values (96077) | 96077 |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.8864095 |
| Min length | 4 |
Characters and Unicode
| Total characters | 757742 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 96082 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | ROU448 |
|---|---|
| 2nd row | ROU1885 |
| 3rd row | ROU1940 |
| 4th row | ROU1945 |
| 5th row | ROU1947 |
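The "Distinct categories" figure refers to Unicode general categories. As a rough illustration (not the profiler's own code), the five sample identifiers above can be classified with Python's standard `unicodedata` module, where `Lu` is Uppercase Letter and `Nd` is Decimal Number:

```python
import unicodedata
from collections import Counter

# The five sample values listed in the "Sample" table.
sample_ids = ["ROU448", "ROU1885", "ROU1940", "ROU1945", "ROU1947"]

# Count characters by Unicode general category code.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```

Only two category codes appear in the sample, which is consistent with the two distinct categories reported for the full column.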
Common Values
| Value | Count | Frequency (%) |
| ROU448 | 1 | < 0.1% |
| UKR63701 | 1 | < 0.1% |
| UKR63655 | 1 | < 0.1% |
| UKR63686 | 1 | < 0.1% |
| UKR63648 | 1 | < 0.1% |
| UKR63722 | 1 | < 0.1% |
| UKR63647 | 1 | < 0.1% |
| UKR63719 | 1 | < 0.1% |
| UKR63671 | 1 | < 0.1% |
| UKR63645 | 1 | < 0.1% |
| Other values (96072) | 96072 |
Words
| Value | Count | Frequency (%) |
| rou448 | 1 | < 0.1% |
| tur25961 | 1 | < 0.1% |
| rou1947 | 1 | < 0.1% |
| rou1961 | 1 | < 0.1% |
| rou2026 | 1 | < 0.1% |
| rou2045 | 1 | < 0.1% |
| tur14260 | 1 | < 0.1% |
| tur18570 | 1 | < 0.1% |
| tur21094 | 1 | < 0.1% |
| ukr6 | 1 | < 0.1% |
| Other values (96072) | 96072 |
Most occurring characters
| Value | Count | Frequency (%) |
| R | 96082 | |
| U | 96082 | |
| K | 96067 | |
| 3 | 48836 | 6.4% |
| 2 | 48820 | 6.4% |
| 1 | 48792 | 6.4% |
| 5 | 48772 | 6.4% |
| 4 | 48759 | 6.4% |
| 6 | 48111 | 6.3% |
| 8 | 47752 | 6.3% |
| Other values (5) | 129669 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 469496 | |
| Uppercase Letter | 288246 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 48836 | |
| 2 | 48820 | |
| 1 | 48792 | |
| 5 | 48772 | |
| 4 | 48759 | |
| 6 | 48111 | |
| 8 | 47752 | |
| 7 | 47733 | |
| 9 | 44148 | |
| 0 | 37773 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 96082 | |
| U | 96082 | |
| K | 96067 | |
| O | 8 | < 0.1% |
| T | 7 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 469496 | |
| Latin | 288246 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 3 | 48836 | |
| 2 | 48820 | |
| 1 | 48792 | |
| 5 | 48772 | |
| 4 | 48759 | |
| 6 | 48111 | |
| 8 | 47752 | |
| 7 | 47733 | |
| 9 | 44148 | |
| 0 | 37773 |
Latin
| Value | Count | Frequency (%) |
| R | 96082 | |
| U | 96082 | |
| K | 96067 | |
| O | 8 | < 0.1% |
| T | 7 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 757742 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| R | 96082 | |
| U | 96082 | |
| K | 96067 | |
| 3 | 48836 | 6.4% |
| 2 | 48820 | 6.4% |
| 1 | 48792 | 6.4% |
| 5 | 48772 | 6.4% |
| 4 | 48759 | 6.4% |
| 6 | 48111 | 6.3% |
| 8 | 47752 | 6.3% |
| Other values (5) | 129669 |
EVENT_DATE
Categorical
| Distinct | 1901 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| 03-August-2022 | 215 |
|---|---|
| 07-September-2022 | 212 |
| 08-September-2022 | 208 |
| 18-August-2022 | 194 |
| 28-July-2022 | 190 |
| Other values (1896) | 95063 |
Length
| Max length | 17 |
|---|---|
| Median length | 15 |
| Mean length | 14.213058 |
| Min length | 11 |
Characters and Unicode
| Total characters | 1365619 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 20-May-2019 |
|---|---|
| 2nd row | 28-March-2022 |
| 3rd row | 28-July-2022 |
| 4th row | 31-July-2022 |
| 5th row | 04-August-2022 |
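EVENT_DATE is profiled as a categorical string in `DD-Month-YYYY` form, so it has to be parsed before any date arithmetic. A minimal sketch using the standard library, assuming an English locale for month names (the helper name `parse_event_date` is hypothetical):

```python
from datetime import date, datetime


def parse_event_date(text: str) -> date:
    """Parse strings such as '03-August-2022' into a date object."""
    # %B matches the full month name in the current locale (assumed English).
    return datetime.strptime(text, "%d-%B-%Y").date()


# The five sample values listed in the "Sample" table.
samples = ["20-May-2019", "28-March-2022", "28-July-2022", "31-July-2022", "04-August-2022"]
parsed = [parse_event_date(s) for s in samples]

for raw, value in zip(samples, parsed):
    print(raw, "->", value.isoformat())
```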
Common Values
| Value | Count | Frequency (%) |
| 03-August-2022 | 215 | 0.2% |
| 07-September-2022 | 212 | 0.2% |
| 08-September-2022 | 208 | 0.2% |
| 18-August-2022 | 194 | 0.2% |
| 28-July-2022 | 190 | 0.2% |
| 21-September-2022 | 190 | 0.2% |
| 08-February-2023 | 189 | 0.2% |
| 01-March-2023 | 189 | 0.2% |
| 01-September-2022 | 188 | 0.2% |
| 14-July-2022 | 184 | 0.2% |
| Other values (1891) | 94123 |
Words
| Value | Count | Frequency (%) |
| 03-august-2022 | 215 | 0.2% |
| 07-september-2022 | 212 | 0.2% |
| 08-september-2022 | 208 | 0.2% |
| 18-august-2022 | 194 | 0.2% |
| 28-july-2022 | 190 | 0.2% |
| 21-september-2022 | 190 | 0.2% |
| 08-february-2023 | 189 | 0.2% |
| 01-march-2023 | 189 | 0.2% |
| 01-september-2022 | 188 | 0.2% |
| 14-july-2022 | 184 | 0.2% |
| Other values (1891) | 94123 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 237078 | |
| - | 192164 | |
| 0 | 143711 | 10.5% |
| e | 88029 | 6.4% |
| 1 | 84036 | 6.2% |
| r | 74751 | 5.5% |
| u | 47799 | 3.5% |
| a | 42338 | 3.1% |
| b | 41442 | 3.0% |
| y | 33215 | 2.4% |
| Other values (27) | 381056 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 576492 | |
| Lowercase Letter | 500881 | |
| Dash Punctuation | 192164 | 14.1% |
| Uppercase Letter | 96082 | 7.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 88029 | |
| r | 74751 | |
| u | 47799 | |
| a | 42338 | |
| b | 41442 | |
| y | 33215 | 6.6% |
| c | 24747 | 4.9% |
| t | 23973 | 4.8% |
| m | 23754 | 4.7% |
| o | 16555 | 3.3% |
| Other values (8) | 84278 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 237078 | |
| 0 | 143711 | |
| 1 | 84036 | 14.6% |
| 9 | 25526 | 4.4% |
| 8 | 24904 | 4.3% |
| 3 | 22984 | 4.0% |
| 5 | 9700 | 1.7% |
| 4 | 9692 | 1.7% |
| 7 | 9549 | 1.7% |
| 6 | 9312 | 1.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 23323 | |
| M | 17062 | |
| A | 14255 | |
| F | 9430 | |
| N | 8297 | 8.6% |
| O | 8258 | 8.6% |
| S | 8192 | 8.5% |
| D | 7265 | 7.6% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 192164 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 768656 | |
| Latin | 596963 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 88029 | |
| r | 74751 | |
| u | 47799 | 8.0% |
| a | 42338 | 7.1% |
| b | 41442 | 6.9% |
| y | 33215 | 5.6% |
| c | 24747 | 4.1% |
| t | 23973 | 4.0% |
| m | 23754 | 4.0% |
| J | 23323 | 3.9% |
| Other values (16) | 173592 |
Common
| Value | Count | Frequency (%) |
| 2 | 237078 | |
| - | 192164 | |
| 0 | 143711 | |
| 1 | 84036 | 10.9% |
| 9 | 25526 | 3.3% |
| 8 | 24904 | 3.2% |
| 3 | 22984 | 3.0% |
| 5 | 9700 | 1.3% |
| 4 | 9692 | 1.3% |
| 7 | 9549 | 1.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1365619 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 237078 | |
| - | 192164 | |
| 0 | 143711 | 10.5% |
| e | 88029 | 6.4% |
| 1 | 84036 | 6.2% |
| r | 74751 | 5.5% |
| u | 47799 | 3.5% |
| a | 42338 | 3.1% |
| b | 41442 | 3.0% |
| y | 33215 | 2.4% |
| Other values (27) | 381056 |
YEAR
Real number (ℝ)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2020.6526 |
| Minimum | 2018 |
|---|---|
| Maximum | 2023 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 2018 |
|---|---|
| 5-th percentile | 2018 |
| Q1 | 2019 |
| median | 2021 |
| Q3 | 2022 |
| 95-th percentile | 2023 |
| Maximum | 2023 |
| Range | 5 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.6932727 |
|---|---|
| Coefficient of variation (CV) | 0.00083798308 |
| Kurtosis | -1.37338 |
| Mean | 2020.6526 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.32017167 |
| Sum | 1.9414834 × 10⁸ |
| Variance | 2.8671724 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2022 | 36303 | |
| 2019 | 16557 | |
| 2018 | 15191 | |
| 2020 | 9938 | 10.3% |
| 2023 | 9472 | 9.9% |
| 2021 | 8621 | 9.0% |
| Value | Count | Frequency (%) |
| 2018 | 15191 | |
| 2019 | 16557 | |
| 2020 | 9938 | 10.3% |
| 2021 | 8621 | 9.0% |
| 2022 | 36303 | |
| 2023 | 9472 | 9.9% |
| Value | Count | Frequency (%) |
| 2023 | 9472 | 9.9% |
| 2022 | 36303 | |
| 2021 | 8621 | 9.0% |
| 2020 | 9938 | 10.3% |
| 2019 | 16557 | |
| 2018 | 15191 |
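The descriptive statistics for YEAR can be re-derived from the value counts alone. The sketch below assumes the reported variance is the sample variance (divisor n − 1), which appears to match the figures in the report. Skewness and kurtosis are left out because their exact estimator is not stated.

```python
import math

# Value counts copied from the YEAR frequency table.
year_counts = {2018: 15191, 2019: 16557, 2020: 9938, 2021: 8621, 2022: 36303, 2023: 9472}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
squared_dev = sum(count * (year - mean) ** 2 for year, count in year_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(f"n={n} sum={total} mean={mean:.4f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.11f}")
```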
TIME_PRECISION
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| 1 | 95142 |
|---|---|
| 2 | 779 |
| 3 | 161 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 96082 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 95142 | |
| 2 | 779 | 0.8% |
| 3 | 161 | 0.2% |
Words
| Value | Count | Frequency (%) |
| 1 | 95142 | |
| 2 | 779 | 0.8% |
| 3 | 161 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 95142 | |
| 2 | 779 | 0.8% |
| 3 | 161 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 96082 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 95142 | |
| 2 | 779 | 0.8% |
| 3 | 161 | 0.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 96082 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 95142 | |
| 2 | 779 | 0.8% |
| 3 | 161 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 96082 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 95142 | |
| 2 | 779 | 0.8% |
| 3 | 161 | 0.2% |
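The 94.6% imbalance reported for TIME_PRECISION is consistent with a score of one minus the normalized Shannon entropy of the class frequencies. The sketch below works under that assumption. The function name and the handling of single-class columns are illustrative choices, not confirmed profiler behavior.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of k classes."""
    k = len(counts)
    if k <= 1:
        # Assumption: a column with a single class is scored 0.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# Counts for TIME_PRECISION values 1, 2 and 3 from the table above.
time_precision_counts = [95142, 779, 161]
score = imbalance_score(time_precision_counts)
print(f"Imbalance: {100 * score:.1f}%")
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as a single class dominates.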
DISORDER_TYPE
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| Political violence | 87884 |
|---|---|
| Demonstrations | 5799 |
| Strategic developments | 2381 |
| Political violence; Demonstrations | 18 |
Length
| Max length | 34 |
|---|---|
| Median length | 18 |
| Mean length | 17.860702 |
| Min length | 14 |
Characters and Unicode
| Total characters | 1716092 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Political violence |
|---|---|
| 2nd row | Strategic developments |
| 3rd row | Demonstrations |
| 4th row | Strategic developments |
| 5th row | Demonstrations |
Common Values
| Value | Count | Frequency (%) |
| Political violence | 87884 | |
| Demonstrations | 5799 | 6.0% |
| Strategic developments | 2381 | 2.5% |
| Political violence; Demonstrations | 18 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| political | 87902 | |
| violence | 87902 | |
| demonstrations | 5817 | 3.1% |
| strategic | 2381 | 1.3% |
| developments | 2381 | 1.3% |
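The lowercased token counts in the table above can be reproduced from the DISORDER_TYPE value counts. The sketch assumes a simple tokenizer that lowercases, splits on whitespace, and strips leading and trailing punctuation. The profiler's actual tokenizer may differ.

```python
import string
from collections import Counter

# Value counts copied from the DISORDER_TYPE "Common Values" table.
value_counts = {
    "Political violence": 87884,
    "Demonstrations": 5799,
    "Strategic developments": 2381,
    "Political violence; Demonstrations": 18,
}

word_counts = Counter()
for value, count in value_counts.items():
    for token in value.lower().split():
        # Drop punctuation at the token edges, e.g. the ';' in 'violence;'.
        word = token.strip(string.punctuation)
        if word:
            word_counts[word] += count

for word, count in word_counts.most_common():
    print(word, count)
```

The 18 combined "Political violence; Demonstrations" rows explain why "political" totals 87902 rather than 87884.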
Most occurring characters
| Value | Count | Frequency (%) |
| i | 271904 | |
| l | 266087 | |
| e | 191145 | |
| o | 189819 | |
| c | 178185 | |
| t | 106679 | 6.2% |
| n | 101917 | 5.9% |
| a | 96100 | 5.6% |
| (space) | 90301 | 5.3% |
| v | 90283 | 5.3% |
| Other values (10) | 133672 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1529673 | |
| Uppercase Letter | 96100 | 5.6% |
| Space Separator | 90301 | 5.3% |
| Other Punctuation | 18 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 271904 | |
| l | 266087 | |
| e | 191145 | |
| o | 189819 | |
| c | 178185 | |
| t | 106679 | 7.0% |
| n | 101917 | 6.7% |
| a | 96100 | 6.3% |
| v | 90283 | 5.9% |
| s | 14015 | 0.9% |
| Other values (5) | 23539 | 1.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 87902 | |
| D | 5817 | 6.1% |
| S | 2381 | 2.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 90301 |
Other Punctuation
| Value | Count | Frequency (%) |
| ; | 18 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1625773 | |
| Common | 90319 | 5.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 271904 | |
| l | 266087 | |
| e | 191145 | |
| o | 189819 | |
| c | 178185 | |
| t | 106679 | 6.6% |
| n | 101917 | 6.3% |
| a | 96100 | 5.9% |
| v | 90283 | 5.6% |
| P | 87902 | 5.4% |
| Other values (8) | 45752 | 2.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 90301 | |
| ; | 18 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1716092 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 271904 | |
| l | 266087 | |
| e | 191145 | |
| o | 189819 | |
| c | 178185 | |
| t | 106679 | 6.2% |
| n | 101917 | 5.9% |
| a | 96100 | 5.6% |
| (space) | 90301 | 5.3% |
| v | 90283 | 5.3% |
| Other values (10) | 133672 |
EVENT_TYPE
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| Explosions/Remote violence | 57393 |
|---|---|
| Battles | 29227 |
| Protests | 5581 |
| Strategic developments | 2381 |
| Violence against civilians | 1080 |
Length
| Max length | 26 |
|---|---|
| Median length | 26 |
| Mean length | 18.983962 |
| Min length | 5 |
Characters and Unicode
| Total characters | 1824017 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Violence against civilians |
|---|---|
| 2nd row | Strategic developments |
| 3rd row | Protests |
| 4th row | Strategic developments |
| 5th row | Protests |
Common Values
| Value | Count | Frequency (%) |
| Explosions/Remote violence | 57393 | |
| Battles | 29227 | |
| Protests | 5581 | 5.8% |
| Strategic developments | 2381 | 2.5% |
| Violence against civilians | 1080 | 1.1% |
| Riots | 420 | 0.4% |
Words
| Value | Count | Frequency (%) |
| violence | 58473 | |
| explosions/remote | 57393 | |
| battles | 29227 | |
| protests | 5581 | 3.5% |
| strategic | 2381 | 1.5% |
| developments | 2381 | 1.5% |
| against | 1080 | 0.7% |
| civilians | 1080 | 0.7% |
| riots | 420 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 276064 | |
| o | 239034 | |
| s | 160136 | 8.8% |
| l | 148554 | 8.1% |
| t | 135652 | 7.4% |
| i | 122987 | 6.7% |
| n | 120407 | 6.6% |
| (space) | 61934 | 3.4% |
| c | 61934 | 3.4% |
| v | 60854 | 3.3% |
| Other values (14) | 436461 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1551215 | |
| Uppercase Letter | 153475 | 8.4% |
| Space Separator | 61934 | 3.4% |
| Other Punctuation | 57393 | 3.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 276064 | |
| o | 239034 | |
| s | 160136 | |
| l | 148554 | |
| t | 135652 | |
| i | 122987 | |
| n | 120407 | |
| c | 61934 | 4.0% |
| v | 60854 | 3.9% |
| p | 59774 | 3.9% |
| Other values (6) | 165819 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 57813 | |
| E | 57393 | |
| B | 29227 | |
| P | 5581 | 3.6% |
| S | 2381 | 1.6% |
| V | 1080 | 0.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 61934 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 57393 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1704690 | |
| Common | 119327 | 6.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 276064 | |
| o | 239034 | |
| s | 160136 | |
| l | 148554 | |
| t | 135652 | |
| i | 122987 | 7.2% |
| n | 120407 | 7.1% |
| c | 61934 | 3.6% |
| v | 60854 | 3.6% |
| p | 59774 | 3.5% |
| Other values (12) | 319294 |
Common
| Value | Count | Frequency (%) |
| (space) | 61934 | |
| / | 57393 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1824017 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 276064 | |
| o | 239034 | |
| s | 160136 | 8.8% |
| l | 148554 | 8.1% |
| t | 135652 | 7.4% |
| i | 122987 | 6.7% |
| n | 120407 | 6.6% |
| (space) | 61934 | 3.4% |
| c | 61934 | 3.4% |
| v | 60854 | 3.3% |
| Other values (14) | 436461 |
SUB_EVENT_TYPE
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| Shelling/artillery/missile attack | 53997 |
|---|---|
| Armed clash | 28667 |
| Peaceful protest | 5427 |
| Air/drone strike | 2497 |
| Disrupted weapons use | 1077 |
| Other values (19) | 4417 |
Length
| Max length | 35 |
|---|---|
| Median length | 33 |
| Mean length | 24.348942 |
| Min length | 5 |
Characters and Unicode
| Total characters | 2339495 |
|---|---|
| Distinct characters | 42 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Attack |
|---|---|
| 2nd row | Disrupted weapons use |
| 3rd row | Peaceful protest |
| 4th row | Disrupted weapons use |
| 5th row | Peaceful protest |
Common Values
| Value | Count | Frequency (%) |
| Shelling/artillery/missile attack | 53997 | |
| Armed clash | 28667 | |
| Peaceful protest | 5427 | 5.6% |
| Air/drone strike | 2497 | 2.6% |
| Disrupted weapons use | 1077 | 1.1% |
| Remote explosive/landmine/IED | 818 | 0.9% |
| Attack | 719 | 0.7% |
| Looting/property destruction | 389 | 0.4% |
| Other | 377 | 0.4% |
| Abduction/forced disappearance | 324 | 0.3% |
| Other values (14) | 1790 | 1.9% |
Words
| Value | Count | Frequency (%) |
| attack | 54716 | |
| shelling/artillery/missile | 53997 | |
| armed | 28667 | |
| clash | 28667 | |
| protest | 5563 | 2.9% |
| peaceful | 5427 | 2.8% |
| air/drone | 2497 | 1.3% |
| strike | 2497 | 1.3% |
| disrupted | 1077 | 0.6% |
| weapons | 1077 | 0.6% |
| Other values (40) | 9349 | 4.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 306331 | |
| i | 228020 | 9.7% |
| e | 226093 | 9.7% |
| a | 201909 | 8.6% |
| t | 186371 | 8.0% |
| r | 157359 | 6.7% |
| s | 150863 | 6.4% |
| / | 113164 | 4.8% |
| (space) | 97452 | 4.2% |
| c | 91014 | 3.9% |
| Other values (32) | 580919 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2029967 | |
| Other Punctuation | 113164 | 4.8% |
| Uppercase Letter | 98536 | 4.2% |
| Space Separator | 97452 | 4.2% |
| Dash Punctuation | 376 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 306331 | |
| i | 228020 | |
| e | 226093 | |
| a | 201909 | |
| t | 186371 | |
| r | 157359 | |
| s | 150863 | |
| c | 91014 | 4.5% |
| m | 84883 | 4.2% |
| h | 83504 | 4.1% |
| Other values (14) | 313620 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 54039 | |
| A | 32299 | |
| P | 5563 | 5.6% |
| D | 1895 | 1.9% |
| E | 836 | 0.8% |
| R | 818 | 0.8% |
| I | 818 | 0.8% |
| L | 389 | 0.4% |
| G | 379 | 0.4% |
| O | 377 | 0.4% |
| Other values (5) | 1123 | 1.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 113164 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 97452 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 376 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2128503 | |
| Common | 210992 | 9.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| l | 306331 | |
| i | 228020 | |
| e | 226093 | |
| a | 201909 | |
| t | 186371 | |
| r | 157359 | 7.4% |
| s | 150863 | 7.1% |
| c | 91014 | 4.3% |
| m | 84883 | 4.0% |
| h | 83504 | 3.9% |
| Other values (29) | 412156 |
Common
| Value | Count | Frequency (%) |
| / | 113164 | |
| (space) | 97452 | |
| - | 376 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2339495 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| l | 306331 | |
| i | 228020 | 9.7% |
| e | 226093 | 9.7% |
| a | 201909 | 8.6% |
| t | 186371 | 8.0% |
| r | 157359 | 6.7% |
| s | 150863 | 6.4% |
| / | 113164 | 4.8% |
| (space) | 97452 | 4.2% |
| c | 91014 | 3.9% |
| Other values (32) | 580919 |
ACTOR1
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 115 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| Military Forces of Russia (2000-) | 32391 |
|---|---|
| NAF: United Armed Forces of Novorossiya | 20543 |
| Military Forces of Ukraine (2019-) | 20063 |
| Military Forces of Ukraine (2014-2019) | 12230 |
| Protesters (Ukraine) | 5533 |
| Other values (110) | 5322 |
Length
| Max length | 148 |
|---|---|
| Median length | 71 |
| Mean length | 34.598832 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3324325 |
|---|---|
| Distinct characters | 60 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 46 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Police Forces of Romania (2016-2019) Coast Guard |
|---|---|
| 2nd row | Military Forces of Romania (2021-) |
| 3rd row | Protesters (Romania) |
| 4th row | Military Forces of Romania (2021-) |
| 5th row | Protesters (Romania) |
Common Values
| Value | Count | Frequency (%) |
| Military Forces of Russia (2000-) | 32391 | |
| NAF: United Armed Forces of Novorossiya | 20543 | |
| Military Forces of Ukraine (2019-) | 20063 | |
| Military Forces of Ukraine (2014-2019) | 12230 | 12.7% |
| Protesters (Ukraine) | 5533 | 5.8% |
| Military Forces of Russia (2000-) Air Force | 2248 | 2.3% |
| Unidentified Armed Group (Ukraine) | 1010 | 1.1% |
| Rioters (Ukraine) | 418 | 0.4% |
| Civilians (Ukraine) | 334 | 0.3% |
| Military Forces of Ukraine (2019-) Air Force | 204 | 0.2% |
| Other values (105) | 1108 | 1.2% |
Words
| Value | Count | Frequency (%) |
| of | 88484 | |
| forces | 88452 | |
| military | 67458 | |
| ukraine | 40607 | |
| russia | 34765 | 7.1% |
| 2000 | 34737 | 7.1% |
| armed | 21559 | 4.4% |
| 2019 | 20589 | 4.2% |
| united | 20545 | 4.2% |
| naf | 20543 | 4.2% |
| Other values (130) | 48612 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 390269 | 11.7% |
| i | 260366 | 7.8% |
| r | 257116 | 7.7% |
| o | 249121 | 7.5% |
| s | 211202 | 6.4% |
| e | 189658 | 5.7% |
| a | 164627 | 5.0% |
| 0 | 149623 | 4.5% |
| F | 111468 | 3.4% |
| t | 101598 | 3.1% |
| Other values (50) | 1239277 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2003437 | |
| Space Separator | 390269 | 11.7% |
| Uppercase Letter | 371168 | 11.2% |
| Decimal Number | 320608 | 9.6% |
| Open Punctuation | 75231 | 2.3% |
| Close Punctuation | 75231 | 2.3% |
| Dash Punctuation | 67788 | 2.0% |
| Other Punctuation | 20593 | 0.6% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 111468 | |
| M | 67661 | |
| U | 62359 | |
| A | 44562 | 12.0% |
| N | 41166 | 11.1% |
| R | 35252 | 9.5% |
| P | 6101 | 1.6% |
| G | 1218 | 0.3% |
| S | 514 | 0.1% |
| C | 483 | 0.1% |
| Other values (13) | 384 | 0.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 260366 | |
| r | 257116 | |
| o | 249121 | |
| s | 211202 | |
| e | 189658 | |
| a | 164627 | |
| t | 101598 | 5.1% |
| c | 91886 | 4.6% |
| f | 89709 | 4.5% |
| y | 88278 | 4.4% |
| Other values (12) | 299876 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 149623 | |
| 2 | 80155 | |
| 1 | 45417 | 14.2% |
| 9 | 32995 | 10.3% |
| 4 | 12408 | 3.9% |
| 6 | 7 | < 0.1% |
| 7 | 2 | < 0.1% |
| 5 | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 20545 | |
| ' | 47 | 0.2% |
| , | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 390269 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 75231 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 75231 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 67788 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2374605 | |
| Common | 949720 | 28.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 260366 | |
| r | 257116 | 10.8% |
| o | 249121 | 10.5% |
| s | 211202 | 8.9% |
| e | 189658 | 8.0% |
| a | 164627 | 6.9% |
| F | 111468 | 4.7% |
| t | 101598 | 4.3% |
| c | 91886 | 3.9% |
| f | 89709 | 3.8% |
| Other values (35) | 647854 |
Common
| Value | Count | Frequency (%) |
| (space) | 390269 | |
| 0 | 149623 | 15.8% |
| 2 | 80155 | 8.4% |
| ( | 75231 | 7.9% |
| ) | 75231 | 7.9% |
| - | 67788 | 7.1% |
| 1 | 45417 | 4.8% |
| 9 | 32995 | 3.5% |
| : | 20545 | 2.2% |
| 4 | 12408 | 1.3% |
| Other values (5) | 58 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3324325 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 390269 | 11.7% |
| i | 260366 | 7.8% |
| r | 257116 | 7.7% |
| o | 249121 | 7.5% |
| s | 211202 | 6.4% |
| e | 189658 | 5.7% |
| a | 164627 | 5.0% |
| 0 | 149623 | 4.5% |
| F | 111468 | 3.4% |
| t | 101598 | 3.1% |
| Other values (50) | 1239277 |
ASSOC_ACTOR_1
Categorical
HIGH CARDINALITY  MISSING 
| Distinct | 691 |
|---|---|
| Distinct (%) | 10.7% |
| Missing | 89594 |
| Missing (%) | 93.2% |
| Memory size | 750.8 KiB |
| Military Forces of Russia (2000-) | 1060 |
|---|---|
| National Corps Party | 694 |
| Labor Group (Ukraine) | 515 |
| Military Forces of Russia (2000-) Air Force | 353 |
| Refugees/IDPs (Ukraine) | 301 |
| Other values (686) | 3565 |
Length
| Max length | 234 |
|---|---|
| Median length | 177 |
| Mean length | 30.876695 |
| Min length | 3 |
Characters and Unicode
| Total characters | 200328 |
|---|---|
| Distinct characters | 64 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 476 |
|---|---|
| Unique (%) | 7.3% |
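For a mostly missing column such as ASSOC_ACTOR_1, the percentages use different denominators. The arithmetic below suggests that Missing (%) is relative to all rows, while Distinct (%) and Unique (%) are relative to non-missing rows only. This is an inference from the reported figures rather than a documented rule.

```python
# Figures copied from the ASSOC_ACTOR_1 summary tables.
n_rows = 96082
n_missing = 89594
n_distinct = 691
n_unique = 476

# Rows that actually carry a value.
n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows
distinct_pct = 100 * n_distinct / n_present  # assumed denominator: non-missing rows
unique_pct = 100 * n_unique / n_present      # assumed denominator: non-missing rows

print(f"Non-missing rows: {n_present}")
print(f"Missing: {missing_pct:.1f}%  Distinct: {distinct_pct:.1f}%  Unique: {unique_pct:.1f}%")
```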
Sample
| 1st row | Greenpeace |
|---|---|
| 2nd row | Greenpeace |
| 3rd row | Svoboda; Sokil |
| 4th row | Svoboda; Sokil; Military Forces of Ukraine (2014-2019) |
| 5th row | Svoboda; National Corps Party |
Common Values
| Value | Count | Frequency (%) |
| Military Forces of Russia (2000-) | 1060 | 1.1% |
| National Corps Party | 694 | 0.7% |
| Labor Group (Ukraine) | 515 | 0.5% |
| Military Forces of Russia (2000-) Air Force | 353 | 0.4% |
| Refugees/IDPs (Ukraine) | 301 | 0.3% |
| Donbass People's Militia | 251 | 0.3% |
| Stop Corruption | 237 | 0.2% |
| Luhansk People's Militia | 223 | 0.2% |
| Traditions and Order | 186 | 0.2% |
| Wagner Group | 113 | 0.1% |
| Other values (681) | 2555 | 2.7% |
| (Missing) | 89594 |
Words
| Value | Count | Frequency (%) |
| of | 2362 | 8.5% |
| ukraine | 2299 | 8.2% |
| forces | 2182 | 7.8% |
| military | 1962 | 7.0% |
| russia | 1547 | 5.5% |
| 2000 | 1522 | 5.5% |
| party | 1375 | 4.9% |
| national | 1226 | 4.4% |
| corps | 1096 | 3.9% |
| group | 1015 | 3.6% |
| Other values (264) | 11310 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 21408 | 10.7% |
| r | 16541 | 8.3% |
| a | 15098 | 7.5% |
| i | 14907 | 7.4% |
| o | 14858 | 7.4% |
| e | 10774 | 5.4% |
| s | 10316 | 5.1% |
| t | 8771 | 4.4% |
| n | 6970 | 3.5% |
| l | 5526 | 2.8% |
| Other values (54) | 75159 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 131804 | |
| Uppercase Letter | 24840 | 12.4% |
| Space Separator | 21408 | 10.7% |
| Decimal Number | 9392 | 4.7% |
| Open Punctuation | 3942 | 2.0% |
| Close Punctuation | 3942 | 2.0% |
| Other Punctuation | 2816 | 1.4% |
| Dash Punctuation | 2148 | 1.1% |
| Math Symbol | 36 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 16541 | |
| a | 15098 | |
| i | 14907 | |
| o | 14858 | |
| e | 10774 | |
| s | 10316 | |
| t | 8771 | 6.7% |
| n | 6970 | 5.3% |
| l | 5526 | 4.2% |
| u | 4130 | 3.1% |
| Other values (15) | 23913 |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 3337 | |
| M | 2853 | |
| P | 2691 | |
| U | 2565 | |
| R | 2254 | |
| C | 1820 | |
| N | 1558 | 6.3% |
| S | 1324 | 5.3% |
| G | 1190 | 4.8% |
| L | 1025 | 4.1% |
| Other values (14) | 4223 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5358 | |
| 2 | 2312 | |
| 1 | 858 | 9.1% |
| 9 | 595 | 6.3% |
| 4 | 269 | 2.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| ; | 1671 | |
| ' | 619 | 22.0% |
| / | 304 | 10.8% |
| : | 221 | 7.8% |
| , | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 21408 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3942 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3942 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2148 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 36 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 156644 | 78.2% |
| Common | 43684 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 16541 | 10.6% |
| a | 15098 | 9.6% |
| i | 14907 | 9.5% |
| o | 14858 | 9.5% |
| e | 10774 | 6.9% |
| s | 10316 | 6.6% |
| t | 8771 | 5.6% |
| n | 6970 | 4.4% |
| l | 5526 | 3.5% |
| u | 4130 | 2.6% |
| Other values (39) | 48753 |
Common
| Value | Count | Frequency (%) |
| (space) | 21408 | 49.0% |
| 0 | 5358 | 12.3% |
| ( | 3942 | 9.0% |
| ) | 3942 | 9.0% |
| 2 | 2312 | 5.3% |
| - | 2148 | 4.9% |
| ; | 1671 | 3.8% |
| 1 | 858 | 2.0% |
| ' | 619 | 1.4% |
| 9 | 595 | 1.4% |
| Other values (5) | 831 | 1.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 200328 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 21408 | 10.7% |
| r | 16541 | 8.3% |
| a | 15098 | 7.5% |
| i | 14907 | 7.4% |
| o | 14858 | 7.4% |
| e | 10774 | 5.4% |
| s | 10316 | 5.1% |
| t | 8771 | 4.4% |
| n | 6970 | 3.5% |
| l | 5526 | 2.8% |
| Other values (54) | 75159 |
INTER1
Real number (ℝ)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.1179826 |
| Minimum | 1 |
|---|---|
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 8 |
| 95-th percentile | 8 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 3.16869 |
|---|---|
| Coefficient of variation (CV) | 0.76947631 |
| Kurtosis | -1.804966 |
| Mean | 4.1179826 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.29285778 |
| Sum | 395664 |
| Variance | 10.040596 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 8 | 34993 | 36.4% |
| 1 | 33004 | 34.3% |
| 2 | 20583 | 21.4% |
| 6 | 5581 | 5.8% |
| 3 | 1042 | 1.1% |
| 5 | 420 | 0.4% |
| 7 | 334 | 0.3% |
| 4 | 125 | 0.1% |
Minimum values
| Value | Count | Frequency (%) |
| 1 | 33004 | 34.3% |
| 2 | 20583 | 21.4% |
| 3 | 1042 | 1.1% |
| 4 | 125 | 0.1% |
| 5 | 420 | 0.4% |
| 6 | 5581 | 5.8% |
| 7 | 334 | 0.3% |
| 8 | 34993 | 36.4% |
Maximum values
| Value | Count | Frequency (%) |
| 8 | 34993 | 36.4% |
| 7 | 334 | 0.3% |
| 6 | 5581 | 5.8% |
| 5 | 420 | 0.4% |
| 4 | 125 | 0.1% |
| 3 | 1042 | 1.1% |
| 2 | 20583 | 21.4% |
| 1 | 33004 | 34.3% |
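The INTER1 descriptive statistics can be cross-checked against its value counts. The sketch below is an illustration, not part of the profiling tool: the function name `summarize` is hypothetical, the counts are copied from the tables above, and the use of the sample variance (division by n - 1) is an assumption that happens to reproduce the reported 10.040596.

```python
import math

# Value -> count pairs copied from the INTER1 frequency table in this report.
inter1_counts = {8: 34993, 1: 33004, 2: 20583, 6: 5581, 3: 1042, 5: 420, 7: 334, 4: 125}


def summarize(counts):
    """Return n, sum, mean, sample variance, std and CV for a value -> count table."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    # Assumption: sample variance (ddof = 1), which agrees with the reported figure.
    squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
    variance = squared_dev / (n - 1)
    std = math.sqrt(variance)
    return {
        "n": n,
        "sum": total,
        "mean": mean,
        "variance": variance,
        "std": std,
        "cv": std / mean,  # coefficient of variation
    }


stats = summarize(inter1_counts)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Under these assumptions the output should agree with the Sum, Mean, Variance, Standard deviation and Coefficient of variation rows to the precision shown in the report.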
ACTOR2
Categorical
HIGH CARDINALITY  HIGH CORRELATION  IMBALANCE  MISSING 
| Distinct | 88 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 44253 |
| Missing (%) | 46.1% |
| Memory size | 750.8 KiB |
Length
| Max length | 71 |
|---|---|
| Median length | 69 |
| Mean length | 35.536553 |
| Min length | 12 |
Characters and Unicode
| Total characters | 1841824 |
|---|---|
| Distinct characters | 57 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 38 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Civilians (Turkey) |
|---|---|
| 2nd row | Unidentified Military Forces |
| 3rd row | Unidentified Armed Group (International) |
| 4th row | Military Forces of Romania (2021-) |
| 5th row | Unidentified Military Forces |
Common Values
| Value | Count | Frequency (%) |
| NAF: United Armed Forces of Novorossiya | 19387 | 20.2% |
| Military Forces of Ukraine (2019-) | 14713 | 15.3% |
| Military Forces of Ukraine (2014-2019) | 8045 | 8.4% |
| Civilians (Ukraine) | 4542 | 4.7% |
| Military Forces of Russia (2000-) | 2791 | 2.9% |
| Military Forces of Russia (2000-) Donetsk People's Militia | 667 | 0.7% |
| Military Forces of Russia (2000-) Air Force | 436 | 0.5% |
| Unidentified Armed Group (Ukraine) | 269 | 0.3% |
| Military Forces of Ukraine (2019-) Air Force | 138 | 0.1% |
| Police Forces of Ukraine (2019-) | 124 | 0.1% |
| Other values (78) | 717 | 0.7% |
| (Missing) | 44253 | 46.1% |
Words
| Value | Count | Frequency (%) |
| of | 46724 | 17.5% |
| forces | 46696 | 17.5% |
| ukraine | 28289 | 10.6% |
| military | 26941 | 10.1% |
| armed | 19658 | 7.3% |
| united | 19390 | 7.2% |
| naf | 19387 | 7.2% |
| novorossiya | 19387 | 7.2% |
| 2019 | 15070 | 5.6% |
| 2014-2019 | 8167 | 3.1% |
| Other values (85) | 17807 | 6.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 215687 | 11.7% |
| o | 154541 | 8.4% |
| i | 143302 | 7.8% |
| r | 143095 | 7.8% |
| e | 118477 | 6.4% |
| s | 99995 | 5.4% |
| a | 84569 | 4.6% |
| F | 66661 | 3.6% |
| n | 53971 | 2.9% |
| t | 48636 | 2.6% |
| Other values (47) | 712890 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1139844 | 61.9% |
| Uppercase Letter | 232268 | 12.6% |
| Space Separator | 215687 | 11.7% |
| Decimal Number | 141808 | 7.7% |
| Close Punctuation | 32403 | 1.8% |
| Open Punctuation | 32403 | 1.8% |
| Dash Punctuation | 27297 | 1.5% |
| Other Punctuation | 20114 | 1.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 154541 | |
| i | 143302 | |
| r | 143095 | |
| e | 118477 | |
| s | 99995 | |
| a | 84569 | |
| n | 53971 | 4.7% |
| t | 48636 | 4.3% |
| c | 47792 | 4.2% |
| f | 47021 | 4.1% |
| Other values (16) | 198445 |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 66661 | |
| U | 47970 | |
| A | 39624 | |
| N | 38830 | |
| M | 27669 | |
| C | 4660 | 2.0% |
| R | 4216 | 1.8% |
| P | 1164 | 0.5% |
| D | 680 | 0.3% |
| G | 389 | 0.2% |
| Other values (10) | 405 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 43546 | |
| 2 | 35453 | |
| 1 | 31405 | |
| 9 | 23237 | |
| 4 | 8167 | 5.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 19387 | |
| ' | 727 | 3.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 215687 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 32403 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 32403 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 27297 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1372112 | 74.5% |
| Common | 469712 | 25.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 154541 | 11.3% |
| i | 143302 | 10.4% |
| r | 143095 | 10.4% |
| e | 118477 | 8.6% |
| s | 99995 | 7.3% |
| a | 84569 | 6.2% |
| F | 66661 | 4.9% |
| n | 53971 | 3.9% |
| t | 48636 | 3.5% |
| U | 47970 | 3.5% |
| Other values (36) | 410895 |
Common
| Value | Count | Frequency (%) |
| (space) | 215687 | 45.9% |
| 0 | 43546 | 9.3% |
| 2 | 35453 | 7.5% |
| ) | 32403 | 6.9% |
| ( | 32403 | 6.9% |
| 1 | 31405 | 6.7% |
| - | 27297 | 5.8% |
| 9 | 23237 | 4.9% |
| : | 19387 | 4.1% |
| 4 | 8167 | 1.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1841824 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 215687 | 11.7% |
| o | 154541 | 8.4% |
| i | 143302 | 7.8% |
| r | 143095 | 7.8% |
| e | 118477 | 6.4% |
| s | 99995 | 5.4% |
| a | 84569 | 4.6% |
| F | 66661 | 3.6% |
| n | 53971 | 2.9% |
| t | 48636 | 2.6% |
| Other values (47) | 712890 |
ASSOC_ACTOR_2
Categorical
HIGH CARDINALITY  IMBALANCE  MISSING 
| Distinct | 378 |
|---|---|
| Distinct (%) | 2.6% |
| Missing | 81404 |
| Missing (%) | 84.7% |
| Memory size | 750.8 KiB |
Length
| Max length | 204 |
|---|---|
| Median length | 24 |
| Mean length | 26.599741 |
| Min length | 3 |
Characters and Unicode
| Total characters | 390431 |
|---|---|
| Distinct characters | 68 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 234 |
|---|---|
| Unique (%) | 1.6% |
Sample
| 1st row | Fishers (Turkey) |
|---|---|
| 2nd row | Refugees/IDPs (Afghanistan); Civilians (Syria); Refugees/IDPs (Syria); Civilians (Iran); Refugees/IDPs (Iran) |
| 3rd row | Refugees/IDPs (International) |
| 4th row | Refugees/IDPs (International) |
| 5th row | Luhansk People's Militia |
Common Values
| Value | Count | Frequency (%) |
| Donbass People's Militia | 9155 | 9.5% |
| Luhansk People's Militia | 2825 | 2.9% |
| Donbass People's Militia; Civilians (Ukraine) | 638 | 0.7% |
| Civilians (Ukraine) | 272 | 0.3% |
| Luhansk People's Militia; Civilians (Ukraine) | 161 | 0.2% |
| Government of Ukraine (2019-) | 98 | 0.1% |
| Labor Group (Ukraine) | 93 | 0.1% |
| Military Forces of Ukraine (2019-) Air Force | 76 | 0.1% |
| Journalists (Ukraine) | 73 | 0.1% |
| Farmers (Ukraine) | 61 | 0.1% |
| Other values (368) | 1226 | 1.3% |
| (Missing) | 81404 | 84.7% |
Words
| Value | Count | Frequency (%) |
| people's | 12902 | 26.4% |
| militia | 12882 | 26.4% |
| donbass | 9864 | 20.2% |
| luhansk | 3020 | 6.2% |
| ukraine | 2446 | 5.0% |
| civilians | 1145 | 2.3% |
| of | 855 | 1.8% |
| forces | 558 | 1.1% |
| 2019 | 448 | 0.9% |
| military | 376 | 0.8% |
| Other values (234) | 4285 | 8.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 47232 | 12.1% |
| s | 38781 | 9.9% |
| (space) | 34103 | 8.7% |
| a | 31608 | 8.1% |
| e | 31120 | 8.0% |
| l | 27910 | 7.1% |
| o | 26632 | 6.8% |
| n | 18409 | 4.7% |
| t | 14841 | 3.8% |
| p | 13435 | 3.4% |
| Other values (58) | 106360 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 284934 | 73.0% |
| Uppercase Letter | 47316 | 12.1% |
| Space Separator | 34103 | 8.7% |
| Other Punctuation | 14295 | 3.7% |
| Decimal Number | 3568 | 0.9% |
| Open Punctuation | 2674 | 0.7% |
| Close Punctuation | 2674 | 0.7% |
| Dash Punctuation | 840 | 0.2% |
| Math Symbol | 27 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 47232 | |
| s | 38781 | |
| a | 31608 | |
| e | 31120 | |
| l | 27910 | |
| o | 26632 | |
| n | 18409 | 6.5% |
| t | 14841 | 5.2% |
| p | 13435 | 4.7% |
| b | 10019 | 3.5% |
| Other values (15) | 24947 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 13298 | |
| P | 13241 | |
| D | 9917 | |
| L | 3214 | 6.8% |
| U | 2521 | 5.3% |
| C | 1377 | 2.9% |
| F | 943 | 2.0% |
| G | 614 | 1.3% |
| R | 383 | 0.8% |
| S | 309 | 0.7% |
| Other values (15) | 1499 | 3.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1222 | |
| 2 | 887 | |
| 1 | 728 | |
| 9 | 585 | |
| 4 | 143 | 4.0% |
| 8 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 12904 | |
| ; | 1293 | 9.0% |
| : | 85 | 0.6% |
| / | 12 | 0.1% |
| , | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 34103 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2674 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2674 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 840 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 27 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 332250 | 85.1% |
| Common | 58181 | 14.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 47232 | |
| s | 38781 | |
| a | 31608 | |
| e | 31120 | |
| l | 27910 | |
| o | 26632 | 8.0% |
| n | 18409 | 5.5% |
| t | 14841 | 4.5% |
| p | 13435 | 4.0% |
| M | 13298 | 4.0% |
| Other values (40) | 68984 |
Common
| Value | Count | Frequency (%) |
| (space) | 34103 | 58.6% |
| ' | 12904 | 22.2% |
| ( | 2674 | 4.6% |
| ) | 2674 | 4.6% |
| ; | 1293 | 2.2% |
| 0 | 1222 | 2.1% |
| 2 | 887 | 1.5% |
| - | 840 | 1.4% |
| 1 | 728 | 1.3% |
| 9 | 585 | 1.0% |
| Other values (8) | 271 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 390431 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 47232 | 12.1% |
| s | 38781 | 9.9% |
| (space) | 34103 | 8.7% |
| a | 31608 | 8.1% |
| e | 31120 | 8.0% |
| l | 27910 | 7.1% |
| o | 26632 | 6.8% |
| n | 18409 | 4.7% |
| t | 14841 | 3.8% |
| p | 13435 | 3.4% |
| Other values (58) | 106360 |
INTER2
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3435295 |
| Minimum | 0 |
|---|---|
| Maximum | 8 |
| Zeros | 44253 |
| Zeros (%) | 46.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.1025779 |
|---|---|
| Coefficient of variation (CV) | 1.5649659 |
| Kurtosis | 3.9236426 |
| Mean | 1.3435295 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.1941353 |
| Sum | 129089 |
| Variance | 4.4208337 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 44253 | 46.1% |
| 1 | 23238 | 24.2% |
| 2 | 19398 | 20.2% |
| 7 | 4642 | 4.8% |
| 8 | 4100 | 4.3% |
| 3 | 276 | 0.3% |
| 5 | 109 | 0.1% |
| 6 | 62 | 0.1% |
| 4 | 4 | < 0.1% |
Minimum values
| Value | Count | Frequency (%) |
| 0 | 44253 | 46.1% |
| 1 | 23238 | 24.2% |
| 2 | 19398 | 20.2% |
| 3 | 276 | 0.3% |
| 4 | 4 | < 0.1% |
| 5 | 109 | 0.1% |
| 6 | 62 | 0.1% |
| 7 | 4642 | 4.8% |
| 8 | 4100 | 4.3% |
Maximum values
| Value | Count | Frequency (%) |
| 8 | 4100 | 4.3% |
| 7 | 4642 | 4.8% |
| 6 | 62 | 0.1% |
| 5 | 109 | 0.1% |
| 4 | 4 | < 0.1% |
| 3 | 276 | 0.3% |
| 2 | 19398 | 20.2% |
| 1 | 23238 | 24.2% |
| 0 | 44253 | 46.1% |
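The INTER2 quantile statistics follow directly from the cumulative value counts. The sketch below is an illustration under stated assumptions: the helper names `value_at` and `quantile` are hypothetical, and linear interpolation between neighbouring ranks is assumed (the default convention in NumPy and pandas); the report does not say which convention it uses.

```python
import math

# Value -> count pairs copied from the INTER2 frequency table in this report.
inter2_counts = {0: 44253, 1: 23238, 2: 19398, 3: 276, 4: 4, 5: 109, 6: 62, 7: 4642, 8: 4100}


def value_at(counts, rank):
    """Return the value at a 0-based rank of the sorted data without expanding it."""
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if rank < seen:
            return value
    raise IndexError("rank is beyond the number of observations")


def quantile(counts, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    n = sum(counts.values())
    position = q * (n - 1)
    lower, upper = math.floor(position), math.ceil(position)
    low_value = value_at(counts, lower)
    high_value = value_at(counts, upper)
    return low_value + (high_value - low_value) * (position - lower)


q1 = quantile(inter2_counts, 0.25)
median = quantile(inter2_counts, 0.50)
q3 = quantile(inter2_counts, 0.75)
p95 = quantile(inter2_counts, 0.95)
print(q1, median, q3, p95, q3 - q1)
```

Because almost half of the observations are zero, Q1 stays at 0 and the median lands on the first non-zero code.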
INTERACTION
Real number (ℝ)
| Distinct | 37 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.572105 |
| Minimum | 10 |
|---|---|
| Maximum | 88 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 10 |
| Q1 | 12 |
| median | 18 |
| Q3 | 78 |
| 95-th percentile | 80 |
| Maximum | 88 |
| Range | 78 |
| Interquartile range (IQR) | 66 |
Descriptive statistics
| Standard deviation | 30.176587 |
|---|---|
| Coefficient of variation (CV) | 0.84832166 |
| Kurtosis | -1.4611103 |
| Mean | 35.572105 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 0.66259346 |
| Sum | 3417839 |
| Variance | 910.62642 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 12 | 34488 | 35.9% |
| 80 | 23643 | 24.6% |
| 18 | 11696 | 12.2% |
| 10 | 9284 | 9.7% |
| 60 | 5365 | 5.6% |
| 20 | 5349 | 5.6% |
| 78 | 3557 | 3.7% |
| 37 | 644 | 0.7% |
| 70 | 334 | 0.3% |
| 13 | 328 | 0.3% |
| Other values (27) | 1394 | 1.5% |
Minimum values
| Value | Count | Frequency (%) |
| 10 | 9284 | 9.7% |
| 11 | 20 | < 0.1% |
| 12 | 34488 | 35.9% |
| 13 | 328 | 0.3% |
| 14 | 6 | < 0.1% |
| 15 | 136 | 0.1% |
| 16 | 57 | 0.1% |
| 17 | 207 | 0.2% |
| 18 | 11696 | 12.2% |
| 20 | 5349 | 5.6% |
Maximum values
| Value | Count | Frequency (%) |
| 88 | 30 | < 0.1% |
| 80 | 23643 | 24.6% |
| 78 | 3557 | 3.7% |
| 70 | 334 | 0.3% |
| 68 | 55 | 0.1% |
| 66 | 62 | 0.1% |
| 60 | 5365 | 5.6% |
| 58 | 20 | < 0.1% |
| 57 | 98 | 0.1% |
| 56 | 40 | < 0.1% |
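Each Frequency (%) entry in the value tables is the row count divided by the number of observations (96082). A minimal sketch of that conversion follows; the function name `frequency_pct` is hypothetical, and the `< 0.1%` cut-off is inferred from rows in this report (a count of 40 prints as `< 0.1%`, a count of 57 as `0.1%`) rather than taken from the profiling tool's documentation.

```python
N_OBSERVATIONS = 96082  # "Number of observations" in the dataset statistics


def frequency_pct(count, total=N_OBSERVATIONS):
    """Format count / total as a one-decimal percentage string."""
    share = count / total
    # Assumption: non-zero shares that would round to 0.0% are shown as "< 0.1%".
    if 0 < share < 0.0005:
        return "< 0.1%"
    return f"{share * 100:.1f}%"


# Counts copied from the INTERACTION table.
for value, count in [(12, 34488), (80, 23643), (18, 11696), (10, 9284), (56, 40)]:
    print(value, count, frequency_pct(count))
```

The same function applies to any categorical or numeric variable in the report whose percentages are taken over all observations.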
CIVILIAN_TARGETING
Categorical
CONSTANT  MISSING 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 91894 |
| Missing (%) | 95.6% |
| Memory size | 750.8 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 18 |
| Min length | 18 |
Characters and Unicode
| Total characters | 75384 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Civilian targeting |
|---|---|
| 2nd row | Civilian targeting |
| 3rd row | Civilian targeting |
| 4th row | Civilian targeting |
| 5th row | Civilian targeting |
Common Values
| Value | Count | Frequency (%) |
| Civilian targeting | 4188 | 4.4% |
| (Missing) | 91894 | 95.6% |
Words
| Value | Count | Frequency (%) |
| civilian | 4188 | 50.0% |
| targeting | 4188 | 50.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 16752 | 22.2% |
| a | 8376 | 11.1% |
| n | 8376 | 11.1% |
| t | 8376 | 11.1% |
| g | 8376 | 11.1% |
| C | 4188 | 5.6% |
| v | 4188 | 5.6% |
| l | 4188 | 5.6% |
| (space) | 4188 | 5.6% |
| r | 4188 | 5.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 67008 | 88.9% |
| Uppercase Letter | 4188 | 5.6% |
| Space Separator | 4188 | 5.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 16752 | 25.0% |
| a | 8376 | 12.5% |
| n | 8376 | 12.5% |
| t | 8376 | 12.5% |
| g | 8376 | 12.5% |
| v | 4188 | 6.2% |
| l | 4188 | 6.2% |
| r | 4188 | 6.2% |
| e | 4188 | 6.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 4188 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4188 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 71196 | 94.4% |
| Common | 4188 | 5.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 16752 | 23.5% |
| a | 8376 | 11.8% |
| n | 8376 | 11.8% |
| t | 8376 | 11.8% |
| g | 8376 | 11.8% |
| C | 4188 | 5.9% |
| v | 4188 | 5.9% |
| l | 4188 | 5.9% |
| r | 4188 | 5.9% |
| e | 4188 | 5.9% |
Common
| Value | Count | Frequency (%) |
| (space) | 4188 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 75384 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 16752 | 22.2% |
| a | 8376 | 11.1% |
| n | 8376 | 11.1% |
| t | 8376 | 11.1% |
| g | 8376 | 11.1% |
| C | 4188 | 5.6% |
| v | 4188 | 5.6% |
| l | 4188 | 5.6% |
| (space) | 4188 | 5.6% |
| r | 4188 | 5.6% |
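Because CIVILIAN_TARGETING holds a single non-missing value, its character, category and total counts can all be derived from that one string and its occurrence count. The sketch below illustrates the idea with the standard library only; it is not how the profiling tool itself is implemented, and the variable names are hypothetical.

```python
from collections import Counter
import unicodedata

VALUE = "Civilian targeting"  # the only non-missing value of the column
OCCURRENCES = 4188            # its count in the Common Values table

# Per-character totals: occurrences within the string times the number of rows.
char_counts = {ch: n * OCCURRENCES for ch, n in Counter(VALUE).items()}

# Aggregate by Unicode general category (Ll = lowercase, Lu = uppercase, Zs = space).
category_counts = Counter()
for ch, n in char_counts.items():
    category_counts[unicodedata.category(ch)] += n

total_characters = sum(char_counts.values())

print(total_characters)
print(char_counts["i"], char_counts["a"])
print(dict(category_counts))
```

The totals line up with the Total characters, Most occurring characters and Most occurring categories tables above.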
ISO
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 288246 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 642 |
|---|---|
| 2nd row | 642 |
| 3rd row | 642 |
| 4th row | 642 |
| 5th row | 642 |
Common Values
| Value | Count | Frequency (%) |
| 804 | 96067 | > 99.9% |
| 642 | 8 | < 0.1% |
| 792 | 7 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| 804 | 96067 | > 99.9% |
| 642 | 8 | < 0.1% |
| 792 | 7 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 96075 | |
| 8 | 96067 | |
| 0 | 96067 | |
| 2 | 15 | < 0.1% |
| 6 | 8 | < 0.1% |
| 7 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 288246 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 96075 | |
| 8 | 96067 | |
| 0 | 96067 | |
| 2 | 15 | < 0.1% |
| 6 | 8 | < 0.1% |
| 7 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 288246 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 4 | 96075 | |
| 8 | 96067 | |
| 0 | 96067 | |
| 2 | 15 | < 0.1% |
| 6 | 8 | < 0.1% |
| 7 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 288246 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4 | 96075 | |
| 8 | 96067 | |
| 0 | 96067 | |
| 2 | 15 | < 0.1% |
| 6 | 8 | < 0.1% |
| 7 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
REGION
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 6 |
| Mean length | 6.0003643 |
| Min length | 6 |
Characters and Unicode
| Total characters | 576527 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Europe |
|---|---|
| 2nd row | Europe |
| 3rd row | Europe |
| 4th row | Europe |
| 5th row | Europe |
Common Values
| Value | Count | Frequency (%) |
| Europe | 96075 | > 99.9% |
| Middle East | 7 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| europe | 96075 | > 99.9% |
| middle | 7 | < 0.1% |
| east | 7 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 96082 | |
| e | 96082 | |
| u | 96075 | |
| r | 96075 | |
| o | 96075 | |
| p | 96075 | |
| d | 14 | < 0.1% |
| M | 7 | < 0.1% |
| i | 7 | < 0.1% |
| l | 7 | < 0.1% |
| Other values (4) | 28 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 480431 | 83.3% |
| Uppercase Letter | 96089 | 16.7% |
| Space Separator | 7 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 96082 | |
| u | 96075 | |
| r | 96075 | |
| o | 96075 | |
| p | 96075 | |
| d | 14 | < 0.1% |
| i | 7 | < 0.1% |
| l | 7 | < 0.1% |
| a | 7 | < 0.1% |
| s | 7 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 96082 | > 99.9% |
| M | 7 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 7 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 576520 | > 99.9% |
| Common | 7 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 96082 | |
| e | 96082 | |
| u | 96075 | |
| r | 96075 | |
| o | 96075 | |
| p | 96075 | |
| d | 14 | < 0.1% |
| M | 7 | < 0.1% |
| i | 7 | < 0.1% |
| l | 7 | < 0.1% |
| Other values (3) | 21 | < 0.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 7 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 576527 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| E | 96082 | |
| e | 96082 | |
| u | 96075 | |
| r | 96075 | |
| o | 96075 | |
| p | 96075 | |
| d | 14 | < 0.1% |
| M | 7 | < 0.1% |
| i | 7 | < 0.1% |
| l | 7 | < 0.1% |
| Other values (4) | 28 | < 0.1% |
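The REGION length statistics can be reproduced from the two category counts alone. The following is a small sketch under that assumption, using counts copied from the Common Values table; the variable names are hypothetical.

```python
# Category -> count pairs copied from the REGION Common Values table.
region_counts = {"Europe": 96075, "Middle East": 7}

n_rows = sum(region_counts.values())
total_characters = sum(len(value) * count for value, count in region_counts.items())
mean_length = total_characters / n_rows
min_length = min(len(value) for value in region_counts)
max_length = max(len(value) for value in region_counts)

print(n_rows, total_characters, mean_length, min_length, max_length)
```

With only seven "Middle East" rows, the mean length sits barely above the six characters of "Europe".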
COUNTRY
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.9999271 |
| Min length | 6 |
Characters and Unicode
| Total characters | 672567 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Romania |
|---|---|
| 2nd row | Romania |
| 3rd row | Romania |
| 4th row | Romania |
| 5th row | Romania |
Common Values
| Value | Count | Frequency (%) |
| Ukraine | 96067 | > 99.9% |
| Romania | 8 | < 0.1% |
| Turkey | 7 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| ukraine | 96067 | > 99.9% |
| romania | 8 | < 0.1% |
| turkey | 7 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 96083 | |
| i | 96075 | |
| n | 96075 | |
| k | 96074 | |
| r | 96074 | |
| e | 96074 | |
| U | 96067 | |
| R | 8 | < 0.1% |
| o | 8 | < 0.1% |
| m | 8 | < 0.1% |
| Other values (3) | 21 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 576485 | 85.7% |
| Uppercase Letter | 96082 | 14.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 96083 | |
| i | 96075 | |
| n | 96075 | |
| k | 96074 | |
| r | 96074 | |
| e | 96074 | |
| o | 8 | < 0.1% |
| m | 8 | < 0.1% |
| u | 7 | < 0.1% |
| y | 7 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 96067 | > 99.9% |
| R | 8 | < 0.1% |
| T | 7 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 672567 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 96083 | |
| i | 96075 | |
| n | 96075 | |
| k | 96074 | |
| r | 96074 | |
| e | 96074 | |
| U | 96067 | |
| R | 8 | < 0.1% |
| o | 8 | < 0.1% |
| m | 8 | < 0.1% |
| Other values (3) | 21 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 672567 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 96083 | |
| i | 96075 | |
| n | 96075 | |
| k | 96074 | |
| r | 96074 | |
| e | 96074 | |
| U | 96067 | |
| R | 8 | < 0.1% |
| o | 8 | < 0.1% |
| m | 8 | < 0.1% |
| Other values (3) | 21 | < 0.1% |
ADMIN1
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 17 |
| Missing (%) | < 0.1% |
| Memory size | 750.8 KiB |
Length
| Max length | 15 |
|---|---|
| Median length | 7 |
| Mean length | 7.2538802 |
| Min length | 4 |
Characters and Unicode
| Total characters | 696844 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Constanta |
|---|---|
| 2nd row | Constanta |
| 3rd row | Constanta |
| 4th row | Constanta |
| 5th row | Constanta |
Common Values
| Value | Count | Frequency (%) |
| Donetsk | 52800 | 55.0% |
| Luhansk | 13716 | 14.3% |
| Kharkiv | 7804 | 8.1% |
| Zaporizhia | 4866 | 5.1% |
| Kherson | 4206 | 4.4% |
| Kyiv City | 2227 | 2.3% |
| Sumy | 2116 | 2.2% |
| Mykolaiv | 1942 | 2.0% |
| Dnipropetrovsk | 1502 | 1.6% |
| Chernihiv | 882 | 0.9% |
| Other values (21) | 4004 | 4.2% |
Words
| Value | Count | Frequency (%) |
| donetsk | 52800 | 53.7% |
| luhansk | 13716 | 14.0% |
| kharkiv | 7804 | 7.9% |
| zaporizhia | 4866 | 5.0% |
| kherson | 4206 | 4.3% |
| kyiv | 2983 | 3.0% |
| city | 2227 | 2.3% |
| sumy | 2116 | 2.2% |
| mykolaiv | 1942 | 2.0% |
| dnipropetrovsk | 1502 | 1.5% |
| Other values (21) | 4130 | 4.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| k | 78445 | |
| n | 74275 | |
| s | 73667 | |
| o | 67801 | |
| e | 61092 | |
| t | 57509 | |
| D | 54302 | |
| a | 35801 | 5.1% |
| h | 32951 | 4.7% |
| i | 30047 | 4.3% |
| Other values (27) | 130954 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 596015 | 85.5% |
| Uppercase Letter | 98447 | 14.1% |
| Space Separator | 2227 | 0.3% |
| Dash Punctuation | 155 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| k | 78445 | |
| n | 74275 | |
| s | 73667 | |
| o | 67801 | |
| e | 61092 | |
| t | 57509 | |
| a | 35801 | |
| h | 32951 | |
| i | 30047 | 5.0% |
| r | 22184 | 3.7% |
| Other values (11) | 62243 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 54302 | 55.2% |
| K | 15201 | 15.4% |
| L | 14248 | 14.5% |
| Z | 5195 | 5.3% |
| C | 3714 | 3.8% |
| S | 2117 | 2.2% |
| M | 1942 | 2.0% |
| O | 760 | 0.8% |
| V | 234 | 0.2% |
| P | 166 | 0.2% |
| Other values (4) | 568 | 0.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2227 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 155 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 694462 | 99.7% |
| Common | 2382 | 0.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| k | 78445 | |
| n | 74275 | |
| s | 73667 | |
| o | 67801 | |
| e | 61092 | |
| t | 57509 | |
| D | 54302 | |
| a | 35801 | 5.2% |
| h | 32951 | 4.7% |
| i | 30047 | 4.3% |
| Other values (25) | 128572 |
Common
| Value | Count | Frequency (%) |
| (space) | 2227 | 93.5% |
| - | 155 | 6.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 696844 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| k | 78445 | |
| n | 74275 | |
| s | 73667 | |
| o | 67801 | |
| e | 61092 | |
| t | 57509 | |
| D | 54302 | |
| a | 35801 | 5.1% |
| h | 32951 | 4.7% |
| i | 30047 | 4.3% |
| Other values (27) | 130954 |
ADMIN2
Categorical
| Distinct | 150 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 91 |
| Missing (%) | 0.1% |
| Memory size | 750.8 KiB |
Length
| Max length | 21 |
|---|---|
| Median length | 19 |
| Mean length | 10.885041 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1044866 |
|---|---|
| Distinct characters | 44 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Mihai Viteazu |
|---|---|
| 2nd row | Sinop |
| 3rd row | Sile |
| 4th row | Sariyer |
| 5th row | Demirkoy |
Common Values
| Value | Count | Frequency (%) |
| Bakhmutskyi | 10381 | 10.8% |
| Donetskyi | 9865 | 10.3% |
| Pokrovskyi | 8604 | 9.0% |
| Sievierodonetskyi | 6970 | 7.3% |
| Mariupolskyi | 6934 | 7.2% |
| Horlivskyi | 5514 | 5.7% |
| Kalmiuskyi | 4907 | 5.1% |
| Volnovaskyi | 4025 | 4.2% |
| Alchevskyi | 3962 | 4.1% |
| Polohivskyi | 3018 | 3.1% |
| Other values (140) | 31811 | 33.1% |
Words
| Value | Count | Frequency (%) |
| bakhmutskyi | 10381 | 10.8% |
| donetskyi | 9865 | 10.3% |
| pokrovskyi | 8604 | 9.0% |
| sievierodonetskyi | 6970 | 7.3% |
| mariupolskyi | 6934 | 7.2% |
| horlivskyi | 5514 | 5.7% |
| kalmiuskyi | 4907 | 5.1% |
| volnovaskyi | 4025 | 4.2% |
| alchevskyi | 3962 | 4.1% |
| polohivskyi | 3018 | 3.1% |
| Other values (141) | 31812 | 33.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 146752 | |
| k | 119753 | |
| y | 102917 | 9.8% |
| s | 99404 | 9.5% |
| o | 83011 | 7.9% |
| v | 49266 | 4.7% |
| a | 46479 | 4.4% |
| r | 44557 | 4.3% |
| e | 41702 | 4.0% |
| t | 34634 | 3.3% |
| Other values (34) | 276391 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 947781 | 90.7% |
| Uppercase Letter | 96538 | 9.2% |
| Dash Punctuation | 546 | 0.1% |
| Space Separator | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 146752 | |
| k | 119753 | |
| y | 102917 | |
| s | 99404 | |
| o | 83011 | |
| v | 49266 | 5.2% |
| a | 46479 | 4.9% |
| r | 44557 | 4.7% |
| e | 41702 | 4.4% |
| t | 34634 | 3.7% |
| Other values (12) | 179306 |
Uppercase Letter
| Value | Count | Frequency (%) |
| K | 17125 | |
| B | 14186 | |
| P | 11846 | |
| S | 11307 | |
| D | 10333 | |
| M | 8492 | |
| H | 5605 | 5.8% |
| V | 5007 | 5.2% |
| A | 3962 | 4.1% |
| C | 2014 | 2.1% |
| Other values (10) | 6661 | 6.9% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 546 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1044319 | 99.9% |
| Common | 547 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 146752 | |
| k | 119753 | |
| y | 102917 | 9.9% |
| s | 99404 | 9.5% |
| o | 83011 | 7.9% |
| v | 49266 | 4.7% |
| a | 46479 | 4.5% |
| r | 44557 | 4.3% |
| e | 41702 | 4.0% |
| t | 34634 | 3.3% |
| Other values (32) | 275844 |
Common
| Value | Count | Frequency (%) |
| - | 546 | 99.8% |
| (space) | 1 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1044866 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 146752 | |
| k | 119753 | |
| y | 102917 | 9.8% |
| s | 99404 | 9.5% |
| o | 83011 | 7.9% |
| v | 49266 | 4.7% |
| a | 46479 | 4.4% |
| r | 44557 | 4.3% |
| e | 41702 | 4.0% |
| t | 34634 | 3.3% |
| Other values (34) | 276391 | 26.5% |
ADMIN3
Categorical
HIGH CARDINALITY  MISSING 
| Distinct | 776 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 2402 |
| Missing (%) | 2.5% |
| Memory size | 750.8 KiB |
| Sartanska | 6553 |
|---|---|
| Donetska | 4905 |
| Svitlodarska | 4742 |
| Yasynuvatska | 4603 |
| Horlivska | 4407 |
| Other values (771) | 68470 |
Length
| Max length | 31 |
|---|---|
| Median length | 21 |
| Mean length | 10.366193 |
| Min length | 5 |
Characters and Unicode
| Total characters | 971105 |
|---|---|
| Distinct characters | 46 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 181 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | Debaltsivska |
|---|---|
| 2nd row | Svitlodarska |
| 3rd row | Sartanska |
| 4th row | Sartanska |
| 5th row | Yasynuvatska |
Common Values
| Value | Count | Frequency (%) |
| Sartanska | 6553 | 6.8% |
| Donetska | 4905 | 5.1% |
| Svitlodarska | 4742 | 4.9% |
| Yasynuvatska | 4603 | 4.8% |
| Horlivska | 4407 | 4.6% |
| Ocheretynska | 4088 | 4.3% |
| Hirska | 3910 | 4.1% |
| Kadiivska | 3370 | 3.5% |
| Novoazovska | 2687 | 2.8% |
| Marinska | 2372 | 2.5% |
| Other values (766) | 52043 | 54.2% |
| (Missing) | 2402 | 2.5% |
Words
| Value | Count | Frequency (%) |
| sartanska | 6553 | 7.0% |
| donetska | 4905 | 5.2% |
| svitlodarska | 4742 | 5.1% |
| yasynuvatska | 4603 | 4.9% |
| horlivska | 4407 | 4.7% |
| ocheretynska | 4088 | 4.4% |
| hirska | 3910 | 4.2% |
| kadiivska | 3370 | 3.6% |
| novoazovska | 2687 | 2.9% |
| marinska | 2372 | 2.5% |
| Other values (768) | 52073 | 55.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 163786 | 16.9% |
| s | 108344 | 11.2% |
| k | 106979 | 11.0% |
| i | 65530 | 6.7% |
| o | 58764 | 6.1% |
| v | 54891 | 5.7% |
| n | 49909 | 5.1% |
| r | 48780 | 5.0% |
| t | 37450 | 3.9% |
| e | 35843 | 3.7% |
| Other values (36) | 240829 | 24.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 875473 | 90.2% |
| Uppercase Letter | 94656 | 9.7% |
| Dash Punctuation | 946 | 0.1% |
| Space Separator | 30 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 163786 | 18.7% |
| s | 108344 | 12.4% |
| k | 106979 | 12.2% |
| i | 65530 | 7.5% |
| o | 58764 | 6.7% |
| v | 54891 | 6.3% |
| n | 49909 | 5.7% |
| r | 48780 | 5.6% |
| t | 37450 | 4.3% |
| e | 35843 | 4.1% |
| Other values (13) | 145197 | 16.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 18326 | 19.4% |
| H | 11008 | 11.6% |
| D | 8980 | 9.5% |
| K | 8206 | 8.7% |
| O | 7262 | 7.7% |
| M | 6016 | 6.4% |
| V | 4856 | 5.1% |
| Y | 4816 | 5.1% |
| B | 4641 | 4.9% |
| N | 4351 | 4.6% |
| Other values (11) | 16194 | 17.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 946 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 30 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 970129 | 99.9% |
| Common | 976 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 163786 | 16.9% |
| s | 108344 | 11.2% |
| k | 106979 | 11.0% |
| i | 65530 | 6.8% |
| o | 58764 | 6.1% |
| v | 54891 | 5.7% |
| n | 49909 | 5.1% |
| r | 48780 | 5.0% |
| t | 37450 | 3.9% |
| e | 35843 | 3.7% |
| Other values (34) | 239853 | 24.7% |
Common
| Value | Count | Frequency (%) |
| - | 946 | 96.9% |
| (space) | 30 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 971105 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 163786 | 16.9% |
| s | 108344 | 11.2% |
| k | 106979 | 11.0% |
| i | 65530 | 6.7% |
| o | 58764 | 6.1% |
| v | 54891 | 5.7% |
| n | 49909 | 5.1% |
| r | 48780 | 5.0% |
| t | 37450 | 3.9% |
| e | 35843 | 3.7% |
| Other values (36) | 240829 | 24.8% |
LOCATION
Categorical
| Distinct | 2470 |
|---|---|
| Distinct (%) | 2.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| Vodyane | 1483 |
|---|---|
| Avdiivka | 1167 |
| Donetsk Filtration Station | 1089 |
| Marinka | 1041 |
| Zaitseve | 901 |
| Other values (2465) | 90401 |
Length
| Max length | 33 |
|---|---|
| Median length | 26 |
| Mean length | 10.216908 |
| Min length | 3 |
Characters and Unicode
| Total characters | 981661 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 647 |
|---|---|
| Unique (%) | 0.7% |
Sample
| 1st row | Coast of Constanta |
|---|---|
| 2nd row | Coast of Constanta |
| 3rd row | Coast of Constanta |
| 4th row | Coast of Constanta |
| 5th row | Coast of Constanta |
Common Values
| Value | Count | Frequency (%) |
| Vodyane | 1483 | 1.5% |
| Avdiivka | 1167 | 1.2% |
| Donetsk Filtration Station | 1089 | 1.1% |
| Marinka | 1041 | 1.1% |
| Zaitseve | 901 | 0.9% |
| Luhanske | 896 | 0.9% |
| Mineralne | 825 | 0.9% |
| Kominternove | 815 | 0.8% |
| Krasnohorivka | 768 | 0.8% |
| Pyshchevyk | 764 | 0.8% |
| Other values (2460) | 86333 | 89.9% |
Words
| Value | Count | Frequency (%) |
| donetsk | 4289 | 3.7% |
| | 3825 | 3.3% |
| kyiv | 2227 | 1.9% |
| vodyane | 1483 | 1.3% |
| avdiivka | 1167 | 1.0% |
| station | 1145 | 1.0% |
| filtration | 1089 | 0.9% |
| marinka | 1041 | 0.9% |
| balka | 966 | 0.8% |
| shakhta | 930 | 0.8% |
| Other values (2461) | 97351 | 84.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 98516 | 10.0% |
| o | 80078 | 8.2% |
| i | 75523 | 7.7% |
| k | 75413 | 7.7% |
| e | 73371 | 7.5% |
| v | 63043 | 6.4% |
| n | 55010 | 5.6% |
| r | 48666 | 5.0% |
| s | 44352 | 4.5% |
| y | 43635 | 4.4% |
| Other values (48) | 324054 | 33.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 842636 | 85.8% |
| Uppercase Letter | 111820 | 11.4% |
| Space Separator | 19431 | 2.0% |
| Dash Punctuation | 6037 | 0.6% |
| Decimal Number | 1448 | 0.1% |
| Other Punctuation | 243 | < 0.1% |
| Open Punctuation | 23 | < 0.1% |
| Close Punctuation | 23 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 98516 | 11.7% |
| o | 80078 | 9.5% |
| i | 75523 | 9.0% |
| k | 75413 | 8.9% |
| e | 73371 | 8.7% |
| v | 63043 | 7.5% |
| n | 55010 | 6.5% |
| r | 48666 | 5.8% |
| s | 44352 | 5.3% |
| y | 43635 | 5.2% |
| Other values (14) | 185029 | 22.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| K | 15682 | 14.0% |
| S | 13164 | 11.8% |
| P | 8718 | 7.8% |
| N | 8484 | 7.6% |
| D | 8341 | 7.5% |
| M | 8012 | 7.2% |
| V | 7390 | 6.6% |
| B | 6570 | 5.9% |
| Z | 5567 | 5.0% |
| L | 4752 | 4.2% |
| Other values (13) | 25140 | 22.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 579 | 40.0% |
| 4 | 361 | 24.9% |
| 7 | 243 | 16.8% |
| 6 | 243 | 16.8% |
| 1 | 21 | 1.5% |
| 2 | 1 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 19431 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 6037 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 243 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 23 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 23 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 954456 | 97.2% |
| Common | 27205 | 2.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 98516 | 10.3% |
| o | 80078 | 8.4% |
| i | 75523 | 7.9% |
| k | 75413 | 7.9% |
| e | 73371 | 7.7% |
| v | 63043 | 6.6% |
| n | 55010 | 5.8% |
| r | 48666 | 5.1% |
| s | 44352 | 4.6% |
| y | 43635 | 4.6% |
| Other values (37) | 296849 | 31.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 19431 | 71.4% |
| - | 6037 | 22.2% |
| 5 | 579 | 2.1% |
| 4 | 361 | 1.3% |
| 7 | 243 | 0.9% |
| / | 243 | 0.9% |
| 6 | 243 | 0.9% |
| ( | 23 | 0.1% |
| ) | 23 | 0.1% |
| 1 | 21 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 981661 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 98516 | 10.0% |
| o | 80078 | 8.2% |
| i | 75523 | 7.7% |
| k | 75413 | 7.7% |
| e | 73371 | 7.5% |
| v | 63043 | 6.4% |
| n | 55010 | 5.6% |
| r | 48666 | 5.0% |
| s | 44352 | 4.5% |
| y | 43635 | 4.4% |
| Other values (48) | 324054 | 33.0% |
LATITUDE
Real number (ℝ)
| Distinct | 2316 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48.370409 |
| Minimum | 41.162 |
|---|---|
| Maximum | 52.341 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 41.162 |
|---|---|
| 5-th percentile | 46.921 |
| Q1 | 47.716 |
| median | 48.268 |
| Q3 | 48.719 |
| 95-th percentile | 50.45 |
| Maximum | 52.341 |
| Range | 11.179 |
| Interquartile range (IQR) | 1.003 |
Descriptive statistics
| Standard deviation | 1.0842564 |
|---|---|
| Coefficient of variation (CV) | 0.022415696 |
| Kurtosis | 1.542988 |
| Mean | 48.370409 |
| Median Absolute Deviation (MAD) | 0.481 |
| Skewness | 0.81903324 |
| Sum | 4647525.7 |
| Variance | 1.1756119 |
| Monotonicity | Not monotonic |
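The derived rows in the LATITUDE tables (range, interquartile range, coefficient of variation, variance) follow directly from the reported extremes, quartiles, mean, and standard deviation. A minimal Python sketch, using only figures copied from the tables above (the variable names are illustrative, not taken from the profiling tool), recomputes them:

```python
# Values copied from the LATITUDE tables above.
minimum, maximum = 41.162, 52.341
q1, q3 = 47.716, 48.719
mean, std = 48.370409, 1.0842564
total, n_obs = 4647525.7, 96082  # Sum row and number of observations

value_range = maximum - minimum   # compare with the Range row (11.179)
iqr = q3 - q1                     # compare with the IQR row (1.003)
cv = std / mean                   # compare with the CV row (0.022415696)
variance = std ** 2               # compare with the Variance row (1.1756119)
mean_from_sum = total / n_obs     # compare with the Mean row (48.370409)

print(round(value_range, 3), round(iqr, 3), round(cv, 6), round(variance, 5))
```

Small differences in the last digits are expected because the report rounds its inputs.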
| Value | Count | Frequency (%) |
| 47.148 | 1267 | 1.3% |
| 48.139 | 1217 | 1.3% |
| 48.133 | 1089 | 1.1% |
| 47.943 | 1043 | 1.1% |
| 47.127 | 987 | 1.0% |
| 48.431 | 920 | 1.0% |
| 48.429 | 902 | 0.9% |
| 48.1 | 826 | 0.9% |
| 47.175 | 815 | 0.8% |
| 50.427 | 809 | 0.8% |
| Other values (2306) | 86207 | 89.7% |
| Value | Count | Frequency (%) |
| 41.162 | 1 | < 0.1% |
| 41.24 | 1 | < 0.1% |
| 41.253 | 3 | < 0.1% |
| 41.852 | 1 | < 0.1% |
| 42.039 | 1 | < 0.1% |
| 43.389 | 7 | < 0.1% |
| 43.517 | 5 | < 0.1% |
| 44.156 | 7 | < 0.1% |
| 44.407 | 1 | < 0.1% |
| 44.416 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 52.341 | 3 | < 0.1% |
| 52.338 | 4 | < 0.1% |
| 52.334 | 41 | < 0.1% |
| 52.328 | 32 | < 0.1% |
| 52.32 | 1 | < 0.1% |
| 52.317 | 5 | < 0.1% |
| 52.313 | 25 | < 0.1% |
| 52.31 | 3 | < 0.1% |
| 52.308 | 2 | < 0.1% |
| 52.306 | 13 | < 0.1% |
LONGITUDE
Real number (ℝ)
| Distinct | 2596 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.711842 |
| Minimum | 22.163 |
|---|---|
| Maximum | 40.132 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 22.163 |
|---|---|
| 5-th percentile | 30.734 |
| Q1 | 36.489 |
| median | 37.779 |
| Q3 | 37.993 |
| 95-th percentile | 38.635 |
| Maximum | 40.132 |
| Range | 17.969 |
| Interquartile range (IQR) | 1.504 |
Descriptive statistics
| Standard deviation | 2.5746416 |
|---|---|
| Coefficient of variation (CV) | 0.070131092 |
| Kurtosis | 6.3442494 |
| Mean | 36.711842 |
| Median Absolute Deviation (MAD) | 0.381 |
| Skewness | -2.3712511 |
| Sum | 3527347.2 |
| Variance | 6.6287791 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 37.75 | 1708 | 1.8% |
| 37.788 | 1400 | 1.5% |
| 37.808 | 1089 | 1.1% |
| 37.792 | 1055 | 1.1% |
| 37.825 | 1054 | 1.1% |
| 37.505 | 1041 | 1.1% |
| 38.009 | 901 | 0.9% |
| 37.86 | 900 | 0.9% |
| 37.809 | 858 | 0.9% |
| 37.571 | 835 | 0.9% |
| Other values (2586) | 85241 | 88.7% |
| Value | Count | Frequency (%) |
| 22.163 | 3 | < 0.1% |
| 22.206 | 3 | < 0.1% |
| 22.246 | 1 | < 0.1% |
| 22.3 | 60 | 0.1% |
| 22.389 | 1 | < 0.1% |
| 22.393 | 1 | < 0.1% |
| 22.443 | 4 | < 0.1% |
| 22.46 | 2 | < 0.1% |
| 22.594 | 1 | < 0.1% |
| 22.596 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 40.132 | 2 | < 0.1% |
| 39.796 | 1 | < 0.1% |
| 39.747 | 5 | < 0.1% |
| 39.739 | 2 | < 0.1% |
| 39.697 | 2 | < 0.1% |
| 39.689 | 1 | < 0.1% |
| 39.674 | 1 | < 0.1% |
| 39.67 | 4 | < 0.1% |
| 39.668 | 1 | < 0.1% |
| 39.667 | 5 | < 0.1% |
GEO_PRECISION
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| 2 | 69328 |
|---|---|
| 1 | 25445 |
| 3 | 1309 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 96082 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 69328 | 72.2% |
| 1 | 25445 | 26.5% |
| 3 | 1309 | 1.4% |
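Each Frequency (%) cell in a Common Values table is the count divided by the number of observations, rounded to one decimal place. A short Python sketch with the GEO_PRECISION counts copied from the table above (the dictionary layout is only an illustration) shows the arithmetic:

```python
n_obs = 96082  # number of observations from the dataset statistics
counts = {"2": 69328, "1": 25445, "3": 1309}  # GEO_PRECISION value -> count

# GEO_PRECISION has no missing values, so the counts should cover every row.
assert sum(counts.values()) == n_obs

# Percentage share of each value, rounded to one decimal like the report.
shares = {value: round(100 * count / n_obs, 1) for value, count in counts.items()}
print(shares)
```

The same calculation applies to the character, category, script, and block tables, with the relevant total (for example, total characters) as the denominator.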
Words
| Value | Count | Frequency (%) |
| 2 | 69328 | 72.2% |
| 1 | 25445 | 26.5% |
| 3 | 1309 | 1.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 69328 | 72.2% |
| 1 | 25445 | 26.5% |
| 3 | 1309 | 1.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 96082 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 69328 | 72.2% |
| 1 | 25445 | 26.5% |
| 3 | 1309 | 1.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 96082 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 69328 | 72.2% |
| 1 | 25445 | 26.5% |
| 3 | 1309 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 96082 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 69328 | 72.2% |
| 1 | 25445 | 26.5% |
| 3 | 1309 | 1.4% |
SOURCE
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 4819 |
|---|---|
| Distinct (%) | 5.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| OSCE SMM-Ukraine | 23075 |
|---|---|
| Ministry of Defence of Ukraine | 21204 |
| DPR Armed Forces Press Service | 8906 |
| 24 Channel | 3741 |
| Suspilne Media | 3238 |
| Other values (4814) | 35918 |
Length
| Max length | 288 |
|---|---|
| Median length | 214 |
| Mean length | 28.496971 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2738046 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3635 |
|---|---|
| Unique (%) | 3.8% |
Sample
| 1st row | Deschide; Hurriyet Daily; News.ro; CNN; TRT Haber |
|---|---|
| 2nd row | Adevarul; G4media |
| 3rd row | News.ro |
| 4th row | Digi24 |
| 5th row | News.ro |
Common Values
| Value | Count | Frequency (%) |
| OSCE SMM-Ukraine | 23075 | 24.0% |
| Ministry of Defence of Ukraine | 21204 | 22.1% |
| DPR Armed Forces Press Service | 8906 | 9.3% |
| 24 Channel | 3741 | 3.9% |
| Suspilne Media | 3238 | 3.4% |
| Institute for the Study of War | 2226 | 2.3% |
| LPR People's Militia Press Service | 2174 | 2.3% |
| JFO HQ press centre | 2159 | 2.2% |
| Ministry of Defence of Ukraine; JFO HQ press centre | 1998 | 2.1% |
| JFO HQ press centre; Ministry of Defence of Ukraine | 1956 | 2.0% |
| Other values (4809) | 25405 | 26.4% |
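The Common Values table counts "Ministry of Defence of Ukraine; JFO HQ press centre" and "JFO HQ press centre; Ministry of Defence of Ukraine" as different values, which inflates the cardinality of SOURCE. The sketch below is one possible way to normalise such cells in Python. It assumes that a semicolon separates outlets, as the sample rows suggest, and the helper names `split_sources` and `canonical` are hypothetical:

```python
from collections import Counter

# (SOURCE value, count) pairs copied from the Common Values table above.
rows = [
    ("Ministry of Defence of Ukraine", 21204),
    ("JFO HQ press centre", 2159),
    ("Ministry of Defence of Ukraine; JFO HQ press centre", 1998),
    ("JFO HQ press centre; Ministry of Defence of Ukraine", 1956),
]


def split_sources(cell: str) -> list[str]:
    """Split a SOURCE cell on ';' (assumed separator) and trim whitespace."""
    return [part.strip() for part in cell.split(";") if part.strip()]


def canonical(cell: str) -> str:
    """Order-independent key: outlet names sorted and joined by '; '."""
    return "; ".join(sorted(split_sources(cell)))


by_combination = Counter()  # events per unordered set of outlets
by_outlet = Counter()       # events in which each outlet is cited
for cell, count in rows:
    by_combination[canonical(cell)] += count
    for outlet in split_sources(cell):
        by_outlet[outlet] += count

print(by_combination.most_common())
print(by_outlet.most_common())
```

Counting per outlet rather than per cell is a design choice: an event citing two outlets then contributes to both totals, so the per-outlet counts no longer sum to the number of events.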
Words
| Value | Count | Frequency (%) |
| of | 68007 | 16.0% |
| ukraine | 36089 | 8.5% |
| ministry | 32355 | 7.6% |
| defence | 32352 | 7.6% |
| osce | 28159 | 6.6% |
| smm-ukraine | 28159 | 6.6% |
| press | 24547 | 5.8% |
| service | 14125 | 3.3% |
| centre | 11780 | 2.8% |
| forces | 10637 | 2.5% |
| Other values (439) | 139168 | 32.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 329296 | 12.0% |
| e | 316704 | 11.6% |
| r | 199924 | 7.3% |
| i | 185867 | 6.8% |
| n | 184795 | 6.7% |
| s | 116640 | 4.3% |
| o | 114675 | 4.2% |
| a | 110536 | 4.0% |
| f | 108453 | 4.0% |
| M | 98089 | 3.6% |
| Other values (59) | 973067 | 35.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1760513 | 64.3% |
| Uppercase Letter | 563224 | 20.6% |
| Space Separator | 329296 | 12.0% |
| Other Punctuation | 38826 | 1.4% |
| Dash Punctuation | 29394 | 1.1% |
| Decimal Number | 16787 | 0.6% |
| Open Punctuation | 3 | < 0.1% |
| Close Punctuation | 3 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 98089 | 17.4% |
| S | 84035 | 14.9% |
| U | 70218 | 12.5% |
| D | 44739 | 7.9% |
| O | 39376 | 7.0% |
| C | 38271 | 6.8% |
| P | 33361 | 5.9% |
| E | 28447 | 5.1% |
| F | 22251 | 4.0% |
| R | 18111 | 3.2% |
| Other values (16) | 86326 | 15.3% |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 316704 | 18.0% |
| r | 199924 | 11.4% |
| i | 185867 | 10.6% |
| n | 184795 | 10.5% |
| s | 116640 | 6.6% |
| o | 114675 | 6.5% |
| a | 110536 | 6.3% |
| f | 108453 | 6.2% |
| t | 74775 | 4.2% |
| k | 71863 | 4.1% |
| Other values (15) | 276281 | 15.7% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 6285 | 37.4% |
| 4 | 6046 | 36.0% |
| 1 | 1446 | 8.6% |
| 0 | 1421 | 8.5% |
| 6 | 1190 | 7.1% |
| 7 | 281 | 1.7% |
| 9 | 58 | 0.3% |
| 5 | 50 | 0.3% |
| 3 | 10 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| ; | 32967 | 84.9% |
| ' | 3318 | 8.5% |
| . | 2094 | 5.4% |
| : | 256 | 0.7% |
| / | 191 | 0.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 329296 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 29394 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2323737 | 84.9% |
| Common | 414309 | 15.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 316704 | 13.6% |
| r | 199924 | 8.6% |
| i | 185867 | 8.0% |
| n | 184795 | 8.0% |
| s | 116640 | 5.0% |
| o | 114675 | 4.9% |
| a | 110536 | 4.8% |
| f | 108453 | 4.7% |
| M | 98089 | 4.2% |
| S | 84035 | 3.6% |
| Other values (41) | 804019 | 34.6% |
Common
| Value | Count | Frequency (%) |
| (space) | 329296 | 79.5% |
| ; | 32967 | 8.0% |
| - | 29394 | 7.1% |
| 2 | 6285 | 1.5% |
| 4 | 6046 | 1.5% |
| ' | 3318 | 0.8% |
| . | 2094 | 0.5% |
| 1 | 1446 | 0.3% |
| 0 | 1421 | 0.3% |
| 6 | 1190 | 0.3% |
| Other values (8) | 852 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2738046 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 329296 | 12.0% |
| e | 316704 | 11.6% |
| r | 199924 | 7.3% |
| i | 185867 | 6.8% |
| n | 184795 | 6.7% |
| s | 116640 | 4.3% |
| o | 114675 | 4.2% |
| a | 110536 | 4.0% |
| f | 108453 | 4.0% |
| M | 98089 | 3.6% |
| Other values (59) | 973067 | 35.5% |
SOURCE_SCALE
Categorical
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| Other | 70390 |
|---|---|
| National | 12983 |
| Other-National | 4671 |
| Subnational | 2600 |
| Local partner-New media | 1851 |
| Other values (13) | 3587 |
Length
| Max length | 25 |
|---|---|
| Median length | 5 |
| Mean length | 6.7987552 |
| Min length | 5 |
Characters and Unicode
| Total characters | 653238 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 4 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 1 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | National-International |
|---|---|
| 2nd row | National |
| 3rd row | National |
| 4th row | National |
| 5th row | National |
Common Values
| Value | Count | Frequency (%) |
| Other | 70390 | 73.3% |
| National | 12983 | 13.5% |
| Other-National | 4671 | 4.9% |
| Subnational | 2600 | 2.7% |
| Local partner-New media | 1851 | 1.9% |
| Other-Subnational | 1149 | 1.2% |
| International | 519 | 0.5% |
| New media | 407 | 0.4% |
| Subnational-International | 305 | 0.3% |
| Other-International | 302 | 0.3% |
| Other values (8) | 905 | 0.9% |
Words
| Value | Count | Frequency (%) |
| other | 70390 | 70.0% |
| national | 12983 | 12.9% |
| other-national | 4671 | 4.6% |
| subnational | 2600 | 2.6% |
| media | 2411 | 2.4% |
| local | 1857 | 1.8% |
| partner-new | 1851 | 1.8% |
| other-subnational | 1149 | 1.1% |
| new | 630 | 0.6% |
| international | 519 | 0.5% |
| Other values (10) | 1512 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 104035 | 15.9% |
| e | 85203 | 13.0% |
| r | 81791 | 12.5% |
| O | 76671 | 11.7% |
| h | 76671 | 11.7% |
| a | 54551 | 8.4% |
| n | 33107 | 5.1% |
| i | 26736 | 4.1% |
| o | 25959 | 4.0% |
| l | 25959 | 4.0% |
| Other values (15) | 62555 | 9.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 534301 | 81.8% |
| Uppercase Letter | 105264 | 16.1% |
| Dash Punctuation | 9182 | 1.4% |
| Space Separator | 4491 | 0.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 104035 | 19.5% |
| e | 85203 | 15.9% |
| r | 81791 | 15.3% |
| h | 76671 | 14.3% |
| a | 54551 | 10.2% |
| n | 33107 | 6.2% |
| i | 26736 | 5.0% |
| o | 25959 | 4.9% |
| l | 25959 | 4.9% |
| b | 4336 | 0.8% |
| Other values (7) | 15953 | 3.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 76671 | 72.8% |
| N | 20993 | 19.9% |
| S | 4336 | 4.1% |
| L | 1857 | 1.8% |
| I | 1406 | 1.3% |
| R | 1 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 9182 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4491 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 639565 | 97.9% |
| Common | 13673 | 2.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 104035 | 16.3% |
| e | 85203 | 13.3% |
| r | 81791 | 12.8% |
| O | 76671 | 12.0% |
| h | 76671 | 12.0% |
| a | 54551 | 8.5% |
| n | 33107 | 5.2% |
| i | 26736 | 4.2% |
| o | 25959 | 4.1% |
| l | 25959 | 4.1% |
| Other values (13) | 48882 | 7.6% |
Common
| Value | Count | Frequency (%) |
| - | 9182 | 67.2% |
| (space) | 4491 | 32.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 653238 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 104035 | 15.9% |
| e | 85203 | 13.0% |
| r | 81791 | 12.5% |
| O | 76671 | 11.7% |
| h | 76671 | 11.7% |
| a | 54551 | 8.4% |
| n | 33107 | 5.1% |
| i | 26736 | 4.1% |
| o | 25959 | 4.0% |
| l | 25959 | 4.0% |
| Other values (15) | 62555 | 9.6% |
NOTES
Categorical
HIGH CARDINALITY  UNIFORM 
| Distinct | 95577 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 750.8 KiB |
| Between 1 and 20 September, several dozen Donbass veterans, miners and local activists were blocking day and night a railway station in Sosnivka, Lviv, in order to prevent Russian imported coal from being delivered to a local factory. | 20 |
|---|---|
| On 28 May 2018, NAF rebel forces employed grenade launchers of various types and small arms against positions of Military forces of Ukraine near Verknyotoretske, Vodyane, Hnutove, Krasnohorivka, Marinka, Opytne, Pavlopil, Pisky, Shyrokyne and Butivka mine. During the day six Ukrainian Government soldiers were wounded and, according to intelligence reports, three NAF rebels were killed and five wounded at unspecified locations.[3 fatalities split among 18 events]. | 9 |
| On 8 May 2018, NAF rebel forces used heavy machine guns, grenade launchers of various types and small arms to fire upon Military Forces of Ukraine positions near Pyshchevyk, Shyrokyne, Vodyane, Verknyotoretske, Hnutove, Krasnohorivka, Opytne, Pisky, Marinka, Pavlopil and Kamyanka. During the day four Ukrainian Government soldiers were wounded and, according to intelligence reports, one NAF rebel was killed and five wounded at unspecified locations. [1 fatality split among 21 events]. | 9 |
| On 27 May 2018, NAF rebel forces employed grenade launchers of various types and infantry fighting vehicles against positions of Military forces of Ukraine near Vodyane, Marinka, Shyrokyne, Nevelske, Krasnohorivka, Talakivka, Pisky, Verknyotoretske and Butivka mine. During the day two Ukrainian Government soldiers were wounded and, according to intelligence reports, two NAF rebels were wounded at unspecified locations. | 9 |
| Displacement: On 8 April 2022, 3544 people were evacuated from Polohy, Vasylivka, Berdiansk, Tokmak, Melitopol, Enerhodar, Orikhiv, Huliaipole, Zaporizhia. | 8 |
| Other values (95572) | 96027 |
Length
| Max length | 931 |
|---|---|
| Median length | 597 |
| Mean length | 155.81735 |
| Min length | 62 |
Characters and Unicode
| Total characters | 14971243 |
|---|---|
| Distinct characters | 83 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 95346 |
|---|---|
| Unique (%) | 99.2% |
Sample
| 1st row | On 20 May 2019, the Coast Guard of Romania fired at a Turkish fishing boat allegedly fishing in Romania's Exclusive Economic Zone (EEZ), 52 sea miles east of Costinesti [coded to Coast of Constanta]. Three Turkish fishers were reported to wounded. |
|---|---|
| 2nd row | Defusal: On 28 March 2022, Romanian minesweepers conducted a controlled detonation of a sea mine 70km away from the Midia port off the Coast of Constanta. The mine was most likely strayed from a location near Ukraine. |
| 3rd row | On 28 July 2022, Greenpeace activists protested in front of an oil tanker off the Coast of Constanta, demanding a stop to the dependence on fossil fuels due to their effect on the climate. |
| 4th row | Defusal: On 31 July 2022, Romanian Naval Forces defused and destroyed a YAM sea mine off the Coast of Constanta. The mine likely came from the Northern Black Sea region from the Ukrainian coast. |
| 5th row | On 4 August 2022, Greenpeace activists protested in front of an oil tanker off the Coast of Constanta, demanding a stop to the dependence on fossil fuels due to their effect on the climate. |
Common Values
| Value | Count | Frequency (%) |
| Between 1 and 20 September, several dozen Donbass veterans, miners and local activists were blocking day and night a railway station in Sosnivka, Lviv, in order to prevent Russian imported coal from being delivered to a local factory. | 20 | < 0.1% |
| On 28 May 2018, NAF rebel forces employed grenade launchers of various types and small arms against positions of Military forces of Ukraine near Verknyotoretske, Vodyane, Hnutove, Krasnohorivka, Marinka, Opytne, Pavlopil, Pisky, Shyrokyne and Butivka mine. During the day six Ukrainian Government soldiers were wounded and, according to intelligence reports, three NAF rebels were killed and five wounded at unspecified locations.[3 fatalities split among 18 events]. | 9 | < 0.1% |
| On 8 May 2018, NAF rebel forces used heavy machine guns, grenade launchers of various types and small arms to fire upon Military Forces of Ukraine positions near Pyshchevyk, Shyrokyne, Vodyane, Verknyotoretske, Hnutove, Krasnohorivka, Opytne, Pisky, Marinka, Pavlopil and Kamyanka. During the day four Ukrainian Government soldiers were wounded and, according to intelligence reports, one NAF rebel was killed and five wounded at unspecified locations. [1 fatality split among 21 events]. | 9 | < 0.1% |
| On 27 May 2018, NAF rebel forces employed grenade launchers of various types and infantry fighting vehicles against positions of Military forces of Ukraine near Vodyane, Marinka, Shyrokyne, Nevelske, Krasnohorivka, Talakivka, Pisky, Verknyotoretske and Butivka mine. During the day two Ukrainian Government soldiers were wounded and, according to intelligence reports, two NAF rebels were wounded at unspecified locations. | 9 | < 0.1% |
| Displacement: On 8 April 2022, 3544 people were evacuated from Polohy, Vasylivka, Berdiansk, Tokmak, Melitopol, Enerhodar, Orikhiv, Huliaipole, Zaporizhia. | 8 | < 0.1% |
| On 24 February 2019, activists held protests in ten Ukrainian cities, including Kiev, Zhytomyr, Lviv, Kharkiv, Zaporizhia, Ternopil, Dnipro and Odessa, against the use of animal fur and demanding that authorities adopt a law prohibiting fur farms. | 8 | < 0.1% |
| On 24 May 2018, NAF rebel forces employed 120mm mortars, infantry fighting vehicles, grenade launchers and small arms to attack Military Forces of Ukraine positions near Vodyane, Berezove, Hnutove, Marinka, Pavlopil, Pisky, Troitske, Chermalyk, Shyrokyne and Butivka mine. During the day two Ukrainian Government soldiers were wounded at unspecified locations. | 8 | < 0.1% |
| On 26 May 2018, NAF rebel forces employed infantry fighting vehicles against positions of Military forces of Ukraine near Pavlopil, Shyrokyne, Marinka, Vodyane, Kamyanka, Opytne, Hnutove and Pisky. During the day two Ukrainian Government soldiers were wounded and, according to intelligence reports, two NAF rebels were wounded at unspecified locations. | 8 | < 0.1% |
| On 9 May 2018, Military Forces of Ukraine targeted with unknown weapons Dokuchayevsk and the outlying Petrovskoye village, the Yasinovataya area and the southern villages of Novolaspa, Kominternovo, Leninskoye, Oktyabr and Sosnovskoye. | 8 | < 0.1% |
| On 6 May 2018, Military Forces of Ukraine targeted with mortars, grenade launchers and small arms northern suburbs of Donetsk, Yasinovataya, Krutaya Balka, Vasiliyevka, Mikhailovo, Dolomitnoye, Dokuchayevsk and Styla. | 8 | < 0.1% |
| Other values (95567) | 95987 | 99.9% |
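Several of the repeated NOTES values end with a bracketed remark such as "[3 fatalities split among 18 events]", which explains why one report text appears on multiple rows. A minimal Python sketch for pulling those two numbers out of a note follows. The regular expression is an assumption based only on the two phrasings visible in the table above, and `parse_split` is a hypothetical helper name:

```python
import re
from typing import Optional, Tuple

# Matches "[3 fatalities split among 18 events]" and
# "[1 fatality split among 21 events]", tolerating extra whitespace.
SPLIT_NOTE = re.compile(
    r"\[\s*(\d+)\s+fatalit(?:y|ies)\s+split\s+among\s+(\d+)\s+events?\s*\]",
    re.IGNORECASE,
)


def parse_split(note: str) -> Optional[Tuple[int, int]]:
    """Return (fatalities, events) if the note carries a split remark."""
    match = SPLIT_NOTE.search(note)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


example = (
    "three NAF rebels were killed and five wounded at unspecified "
    "locations.[3 fatalities split among 18 events]."
)
print(parse_split(example))
```

Notes phrased differently from these two examples would return `None` and need their own pattern.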
Words
| Value | Count | Frequency (%) |
| the | 116687 | 5.1% |
| on | 99144 | 4.3% |
| of | 90183 | 3.9% |
| forces | 76516 | 3.3% |
| near | 62920 | 2.7% |
| unknown | 53443 | 2.3% |
| a | 42830 | 1.9% |
| and | 42056 | 1.8% |
| russian | 41737 | 1.8% |
| in | 39184 | 1.7% |
| Other values (13753) | 1634995 | 71.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2203982 | 14.7% |
| e | 1225581 | 8.2% |
| n | 1090749 | 7.3% |
| a | 981227 | 6.6% |
| s | 854758 | 5.7% |
| o | 845120 | 5.6% |
| i | 837128 | 5.6% |
| r | 761679 | 5.1% |
| t | 670374 | 4.5% |
| l | 492435 | 3.3% |
| Other values (73) | 5008210 | 33.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10767422 | 71.9% |
| Space Separator | 2203982 | 14.7% |
| Uppercase Letter | 888100 | 5.9% |
| Decimal Number | 670000 | 4.5% |
| Other Punctuation | 409222 | 2.7% |
| Dash Punctuation | 14377 | 0.1% |
| Close Punctuation | 9065 | 0.1% |
| Open Punctuation | 9065 | 0.1% |
| Math Symbol | 8 | < 0.1% |
| Currency Symbol | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1225581 | 11.4% |
| n | 1090749 | 10.1% |
| a | 981227 | 9.1% |
| s | 854758 | 7.9% |
| o | 845120 | 7.8% |
| i | 837128 | 7.8% |
| r | 761679 | 7.1% |
| t | 670374 | 6.2% |
| l | 492435 | 4.6% |
| u | 346868 | 3.2% |
| Other values (16) | 2661503 | 24.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 139239 | 15.7% |
| M | 104238 | 11.7% |
| S | 86239 | 9.7% |
| C | 71569 | 8.1% |
| D | 69558 | 7.8% |
| R | 55163 | 6.2% |
| U | 43579 | 4.9% |
| F | 43412 | 4.9% |
| A | 38350 | 4.3% |
| N | 36125 | 4.1% |
| Other values (16) | 200628 | 22.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 207168 | 50.6% |
| . | 162570 | 39.7% |
| / | 24564 | 6.0% |
| ' | 12968 | 3.2% |
| : | 1797 | 0.4% |
| ; | 94 | < 0.1% |
| ! | 22 | < 0.1% |
| % | 21 | < 0.1% |
| # | 10 | < 0.1% |
| ? | 4 | < 0.1% |
| Other values (2) | 4 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 266356 | 39.8% |
| 0 | 131290 | 19.6% |
| 1 | 114977 | 17.2% |
| 8 | 34760 | 5.2% |
| 3 | 32210 | 4.8% |
| 9 | 28348 | 4.2% |
| 5 | 18022 | 2.7% |
| 4 | 16411 | 2.4% |
| 6 | 14026 | 2.1% |
| 7 | 13600 | 2.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 4756 | 52.5% |
| ] | 4309 | 47.5% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 4755 | 52.5% |
| [ | 4310 | 47.5% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 6 | 75.0% |
| = | 2 | 25.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2203982 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 14377 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 2 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11655522 | 77.9% |
| Common | 3315721 | 22.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1225581 | 10.5% |
| n | 1090749 | 9.4% |
| a | 981227 | 8.4% |
| s | 854758 | 7.3% |
| o | 845120 | 7.3% |
| i | 837128 | 7.2% |
| r | 761679 | 6.5% |
| t | 670374 | 5.8% |
| l | 492435 | 4.2% |
| u | 346868 | 3.0% |
| Other values (42) | 3549603 | 30.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 2203982 | 66.5% |
| 2 | 266356 | 8.0% |
| , | 207168 | 6.2% |
| . | 162570 | 4.9% |
| 0 | 131290 | 4.0% |
| 1 | 114977 | 3.5% |
| 8 | 34760 | 1.0% |
| 3 | 32210 | 1.0% |
| 9 | 28348 | 0.9% |
| / | 24564 | 0.7% |
| Other values (21) | 109496 | 3.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14971243 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2203982 | 14.7% |
| e | 1225581 | 8.2% |
| n | 1090749 | 7.3% |
| a | 981227 | 6.6% |
| s | 854758 | 5.7% |
| o | 845120 | 5.6% |
| i | 837128 | 5.6% |
| r | 761679 | 5.1% |
| t | 670374 | 4.5% |
| l | 492435 | 3.3% |
| Other values (73) | 5008210 | 33.5% |
FATALITIES
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 106 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.44319435 |
| Minimum | 0 |
|---|---|
| Maximum | 600 |
| Zeros | 91308 |
| Zeros (%) | 95.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 600 |
| Range | 600 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 6.3160898 |
|---|---|
| Coefficient of variation (CV) | 14.251287 |
| Kurtosis | 2807.4677 |
| Mean | 0.44319435 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 43.394402 |
| Sum | 42583 |
| Variance | 39.892991 |
| Monotonicity | Not monotonic |
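The descriptive statistics above are mutually consistent, which a few lines of Python can confirm. This is a minimal sketch that assumes the usual definitions (mean = sum / count, variance = standard deviation squared, CV = standard deviation / mean). The variable names and the `close` helper are illustrative, and the figures are copied from the FATALITIES tables.

```python
import math

# Figures copied from the FATALITIES tables in this report.
n_observations = 96082
total = 42583
reported_mean = 0.44319435
reported_std = 6.3160898
reported_variance = 39.892991
reported_cv = 14.251287

# Assumed definitions: mean = sum / n, variance = std ** 2, CV = std / mean.
mean = total / n_observations
variance = reported_std ** 2
cv = reported_std / reported_mean


def close(a: float, b: float, rel_tol: float = 1e-6) -> bool:
    """Compare two floats up to the rounding used in the report."""
    return math.isclose(a, b, rel_tol=rel_tol)


print(close(mean, reported_mean))
print(close(variance, reported_variance))
print(close(cv, reported_cv))
```

A CV above 14 together with a median and Q3 of 0 matches the SKEWED and ZEROS alerts: a few very large events dominate the spread.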
Common Values
| Value | Count | Frequency (%) |
| 0 | 91308 | 95.0% |
| 1 | 2441 | 2.5% |
| 2 | 560 | 0.6% |
| 10 | 364 | 0.4% |
| 3 | 328 | 0.3% |
| 4 | 109 | 0.1% |
| 5 | 85 | 0.1% |
| 6 | 67 | 0.1% |
| 30 | 51 | 0.1% |
| 7 | 48 | < 0.1% |
| Other values (96) | 721 | 0.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 91308 | 95.0% |
| 1 | 2441 | 2.5% |
| 2 | 560 | 0.6% |
| 3 | 328 | 0.3% |
| 4 | 109 | 0.1% |
| 5 | 85 | 0.1% |
| 6 | 67 | 0.1% |
| 7 | 48 | < 0.1% |
| 8 | 35 | < 0.1% |
| 9 | 25 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 600 | 1 | < 0.1% |
| 500 | 2 | < 0.1% |
| 485 | 1 | < 0.1% |
| 400 | 2 | < 0.1% |
| 300 | 4 | < 0.1% |
| 250 | 2 | < 0.1% |
| 221 | 1 | < 0.1% |
| 220 | 1 | < 0.1% |
| 200 | 6 | < 0.1% |
| 180 | 2 | < 0.1% |
TAGS
Categorical
HIGH CARDINALITY  IMBALANCE  MISSING 
| Distinct | 358 |
|---|---|
| Distinct (%) | 6.0% |
| Missing | 90144 |
| Missing (%) | 93.8% |
| Memory size | 750.8 KiB |
| crowd size=no report | 3419 |
|---|---|
| crowd size=several dozen | 240 |
| crowd size=about 100 | 213 |
| crowd size=about 50 | 147 |
| crowd size=about 200 | 107 |
| Other values (353) | 1812 |
Length
| Max length | 120 |
|---|---|
| Median length | 20 |
| Mean length | 20.494611 |
| Min length | 12 |
Characters and Unicode
| Total characters | 121697 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 190 |
|---|---|
| Unique (%) | 3.2% |
Sample
| 1st row | crowd size=no report |
|---|---|
| 2nd row | crowd size=no report |
| 3rd row | crowd size=about 200 |
| 4th row | crowd size=more than 100 |
| 5th row | crowd size=more than 200 |
Common Values
| Value | Count | Frequency (%) |
| crowd size=no report | 3419 | 3.6% |
| crowd size=several dozen | 240 | 0.2% |
| crowd size=about 100 | 213 | 0.2% |
| crowd size=about 50 | 147 | 0.2% |
| crowd size=about 200 | 107 | 0.1% |
| crowd size=about 30 | 98 | 0.1% |
| crowd size=about 20 | 87 | 0.1% |
| crowd size=several hundred | 83 | 0.1% |
| crowd size=hundreds | 72 | 0.1% |
| crowd size=dozens | 64 | 0.1% |
| Other values (348) | 1408 | 1.5% |
| (Missing) | 90144 | 93.8% |
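Every common value above follows a `key=value` pattern (`crowd size=...`), so the column can be split into structured fields before analysis. The sketch below is one hedged way to do that. The `parse_tags` helper is hypothetical, and treating `;` as the separator between multiple tags is an assumption, not something this report confirms.

```python
def parse_tags(raw: str) -> dict[str, str]:
    """Split a TAGS string into key/value pairs.

    Assumption: multiple tags, if present, are separated by ';'.
    """
    tags = {}
    for part in raw.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:  # keep only well-formed "key=value" entries
            tags[key.strip()] = value.strip()
    return tags


print(parse_tags("crowd size=about 200"))  # {'crowd size': 'about 200'}
```

Because 93.8% of the values are missing, any code applying this to the full column should handle missing entries (for example `NaN`) before calling the function.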
Words
| Value | Count | Frequency (%) |
| crowd | 5907 | 32.7% |
| size=no | 3422 | 19.0% |
| report | 3421 | 18.9% |
| size=about | 1360 | 7.5% |
| size=several | 364 | 2.0% |
| dozen | 363 | 2.0% |
| 100 | 286 | 1.6% |
| 50 | 183 | 1.0% |
| 200 | 147 | 0.8% |
| than | 142 | 0.8% |
| Other values (246) | 2460 | 13.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 15026 | 12.3% |
| r | 13703 | 11.3% |
| (space) | 12117 | 10.0% |
| e | 11267 | 9.3% |
| d | 6883 | 5.7% |
| s | 6700 | 5.5% |
| z | 6337 | 5.2% |
| w | 6005 | 4.9% |
| i | 6001 | 4.9% |
| c | 5943 | 4.9% |
| Other values (31) | 31715 | 26.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 98470 | 80.9% |
| Space Separator | 12117 | 10.0% |
| Math Symbol | 5907 | 4.9% |
| Decimal Number | 4985 | 4.1% |
| Dash Punctuation | 149 | 0.1% |
| Other Punctuation | 69 | 0.1% |
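The category names in this table correspond to Unicode general categories (Lowercase Letter = `Ll`, Space Separator = `Zs`, Math Symbol = `Sm`, Decimal Number = `Nd`, Dash Punctuation = `Pd`, Other Punctuation = `Po`). The sketch below shows how counts like these could be derived for a single value with Python's standard library. Whether the report was produced exactly this way is an assumption, and `category_counts` is an illustrative helper.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general-category code (e.g. 'Ll', 'Zs', 'Sm')."""
    return Counter(unicodedata.category(ch) for ch in text)


# The most common TAGS value: 17 lowercase letters, 2 spaces, 1 '='.
counts = category_counts("crowd size=no report")
print(counts)
```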
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 15026 | 15.3% |
| r | 13703 | 13.9% |
| e | 11267 | 11.4% |
| d | 6883 | 7.0% |
| s | 6700 | 6.8% |
| z | 6337 | 6.4% |
| w | 6005 | 6.1% |
| i | 6001 | 6.1% |
| c | 5943 | 6.0% |
| t | 5421 | 5.5% |
| Other values (13) | 15184 | 15.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2755 | 55.3% |
| 1 | 643 | 12.9% |
| 5 | 535 | 10.7% |
| 2 | 411 | 8.2% |
| 3 | 300 | 6.0% |
| 4 | 142 | 2.8% |
| 7 | 76 | 1.5% |
| 6 | 73 | 1.5% |
| 8 | 37 | 0.7% |
| 9 | 13 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 37 | 53.6% |
| / | 18 | 26.1% |
| ; | 6 | 8.7% |
| , | 6 | 8.7% |
| . | 2 | 2.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 12117 | 100.0% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 5907 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 149 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 98470 | 80.9% |
| Common | 23227 | 19.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 15026 | 15.3% |
| r | 13703 | 13.9% |
| e | 11267 | 11.4% |
| d | 6883 | 7.0% |
| s | 6700 | 6.8% |
| z | 6337 | 6.4% |
| w | 6005 | 6.1% |
| i | 6001 | 6.1% |
| c | 5943 | 6.0% |
| t | 5421 | 5.5% |
| Other values (13) | 15184 | 15.4% |
Common
| Value | Count | Frequency (%) |
| (space) | 12117 | 52.2% |
| = | 5907 | 25.4% |
| 0 | 2755 | 11.9% |
| 1 | 643 | 2.8% |
| 5 | 535 | 2.3% |
| 2 | 411 | 1.8% |
| 3 | 300 | 1.3% |
| - | 149 | 0.6% |
| 4 | 142 | 0.6% |
| 7 | 76 | 0.3% |
| Other values (8) | 192 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 121697 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 15026 | 12.3% |
| r | 13703 | 11.3% |
| (space) | 12117 | 10.0% |
| e | 11267 | 9.3% |
| d | 6883 | 5.7% |
| s | 6700 | 5.5% |
| z | 6337 | 5.2% |
| w | 6005 | 4.9% |
| i | 6001 | 4.9% |
| c | 5943 | 4.9% |
| Other values (31) | 31715 | 26.1% |
TIMESTAMP
Real number (ℝ)
| Distinct | 8489 |
|---|---|
| Distinct (%) | 8.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.6478314 × 10⁹ |
| Minimum | 1.5711644 × 10⁹ |
| Maximum | 1.6794373 × 10⁹ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 750.8 KiB |
Quantile statistics
| Minimum | 1.5711644 × 10⁹ |
|---|---|
| 5-th percentile | 1.6185016 × 10⁹ |
| Q1 | 1.6189499 × 10⁹ |
| median | 1.6498755 × 10⁹ |
| Q3 | 1.6643009 × 10⁹ |
| 95-th percentile | 1.6770015 × 10⁹ |
| Maximum | 1.6794373 × 10⁹ |
| Range | 1.0827288 × 10⁸ |
| Interquartile range (IQR) | 45351025 |
Descriptive statistics
| Standard deviation | 20372761 |
|---|---|
| Coefficient of variation (CV) | 0.012363377 |
| Kurtosis | -1.2179228 |
| Mean | 1.6478314 × 10⁹ |
| Median Absolute Deviation (MAD) | 16312729 |
| Skewness | -0.30228492 |
| Sum | 1.5832693 × 10¹⁴ |
| Variance | 4.150494 × 10¹⁴ |
| Monotonicity | Not monotonic |
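The magnitudes above (about 1.57 × 10⁹ to 1.68 × 10⁹) are consistent with Unix epoch seconds, although the report itself does not state the unit. Under that assumption, the following sketch converts the reported extremes to UTC datetimes so the range is easier to read. `to_utc` is an illustrative helper, not part of the profiling output.

```python
from datetime import datetime, timezone

# Extremes copied from the TIMESTAMP tables in this report.
TIMESTAMP_MIN = 1571164407
TIMESTAMP_MAX = 1679437290


def to_utc(epoch_seconds: int) -> datetime:
    """Interpret a value as Unix epoch seconds (an assumption) and return a UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


earliest = to_utc(TIMESTAMP_MIN)
latest = to_utc(TIMESTAMP_MAX)
print(earliest.isoformat())  # 2019-10-15T18:33:27+00:00
print(latest.isoformat())    # 2023-03-21T22:21:30+00:00
```

Read this way, the column spans roughly October 2019 to March 2023.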
Common Values
| Value | Count | Frequency (%) |
| 1658857905 | 495 | 0.5% |
| 1658249348 | 472 | 0.5% |
| 1660674709 | 437 | 0.5% |
| 1659462993 | 425 | 0.4% |
| 1660055881 | 419 | 0.4% |
| 1675798464 | 408 | 0.4% |
| 1656435887 | 402 | 0.4% |
| 1675191967 | 394 | 0.4% |
| 1664300891 | 393 | 0.4% |
| 1673376406 | 387 | 0.4% |
| Other values (8479) | 91850 | 95.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1571164407 | 6 | < 0.1% |
| 1572403627 | 1 | < 0.1% |
| 1572403774 | 1 | < 0.1% |
| 1580849930 | 1 | < 0.1% |
| 1606149144 | 1 | < 0.1% |
| 1618436427 | 1 | < 0.1% |
| 1618436428 | 3 | < 0.1% |
| 1618436429 | 5 | < 0.1% |
| 1618436430 | 5 | < 0.1% |
| 1618436431 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 1679437290 | 13 | < 0.1% |
| 1679428447 | 1 | < 0.1% |
| 1679428446 | 9 | < 0.1% |
| 1679428444 | 2 | < 0.1% |
| 1679425924 | 133 | 0.1% |
| 1679425923 | 284 | 0.3% |
| 1679425922 | 302 | 0.3% |
| 1679425921 | 288 | 0.3% |
| 1679425920 | 16 | < 0.1% |
| 1678830929 | 15 | < 0.1% |
Correlations
| | YEAR | INTER1 | INTER2 | INTERACTION | LATITUDE | LONGITUDE | FATALITIES | TIMESTAMP | TIME_PRECISION | DISORDER_TYPE | EVENT_TYPE | SUB_EVENT_TYPE | ACTOR2 | ISO | REGION | COUNTRY | ADMIN1 | GEO_PRECISION | SOURCE_SCALE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YEAR | 1.000 | 0.576 | -0.120 | 0.606 | 0.113 | -0.314 | 0.091 | 0.760 | 0.043 | 0.140 | 0.196 | 0.220 | 0.402 | 0.000 | 0.000 | 0.000 | 0.246 | 0.218 | 0.215 |
| INTER1 | 0.576 | 1.000 | -0.453 | 0.843 | 0.211 | -0.432 | 0.073 | 0.555 | 0.048 | 0.616 | 0.679 | 0.702 | 0.639 | 0.005 | 0.004 | 0.005 | 0.359 | 0.369 | 0.206 |
| INTER2 | -0.120 | -0.453 | 1.000 | -0.387 | -0.140 | 0.157 | 0.239 | -0.130 | 0.195 | 0.263 | 0.442 | 0.421 | 0.999 | 0.023 | 0.021 | 0.023 | 0.200 | 0.330 | 0.196 |
| INTERACTION | 0.606 | 0.843 | -0.387 | 1.000 | 0.199 | -0.519 | 0.022 | 0.585 | 0.072 | 0.605 | 0.586 | 0.559 | 0.629 | 0.000 | 0.000 | 0.000 | 0.325 | 0.370 | 0.173 |
| LATITUDE | 0.113 | 0.211 | -0.140 | 0.199 | 1.000 | 0.179 | 0.028 | 0.086 | 0.073 | 0.238 | 0.213 | 0.183 | 0.365 | 0.813 | 1.000 | 0.813 | 0.763 | 0.236 | 0.141 |
| LONGITUDE | -0.314 | -0.432 | 0.157 | -0.519 | 0.179 | 1.000 | -0.032 | -0.305 | 0.058 | 0.389 | 0.334 | 0.258 | 0.273 | 0.109 | 0.045 | 0.109 | 0.878 | 0.334 | 0.184 |
| FATALITIES | 0.091 | 0.073 | 0.239 | 0.022 | 0.028 | -0.032 | 1.000 | 0.070 | 0.053 | 0.000 | 0.011 | 0.006 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.036 | 0.023 |
| TIMESTAMP | 0.760 | 0.555 | -0.130 | 0.585 | 0.086 | -0.305 | 0.070 | 1.000 | 0.047 | 0.189 | 0.202 | 0.199 | 0.470 | 0.267 | 0.378 | 0.267 | 0.459 | 0.175 | 0.127 |
| TIME_PRECISION | 0.043 | 0.048 | 0.195 | 0.072 | 0.073 | 0.058 | 0.053 | 0.047 | 1.000 | 0.157 | 0.289 | 0.344 | 0.234 | 0.018 | 0.026 | 0.018 | 0.104 | 0.097 | 0.100 |
| DISORDER_TYPE | 0.140 | 0.616 | 0.263 | 0.605 | 0.238 | 0.389 | 0.000 | 0.189 | 0.157 | 1.000 | 0.813 | 1.000 | 0.573 | 0.038 | 0.045 | 0.038 | 0.431 | 0.383 | 0.236 |
| EVENT_TYPE | 0.196 | 0.679 | 0.442 | 0.586 | 0.213 | 0.334 | 0.011 | 0.202 | 0.289 | 0.813 | 1.000 | 1.000 | 0.561 | 0.038 | 0.045 | 0.038 | 0.376 | 0.426 | 0.221 |
| SUB_EVENT_TYPE | 0.220 | 0.702 | 0.421 | 0.559 | 0.183 | 0.258 | 0.006 | 0.199 | 0.344 | 1.000 | 1.000 | 1.000 | 0.431 | 0.117 | 0.159 | 0.117 | 0.194 | 0.479 | 0.133 |
| ACTOR2 | 0.402 | 0.639 | 0.999 | 0.629 | 0.365 | 0.273 | 0.000 | 0.470 | 0.234 | 0.573 | 0.561 | 0.431 | 1.000 | 0.636 | 0.521 | 0.636 | 0.316 | 0.516 | 0.227 |
| ISO | 0.000 | 0.005 | 0.023 | 0.000 | 0.813 | 0.109 | 0.000 | 0.267 | 0.018 | 0.038 | 0.038 | 0.117 | 0.636 | 1.000 | 1.000 | 1.000 | 1.000 | 0.007 | 0.067 |
| REGION | 0.000 | 0.004 | 0.021 | 0.000 | 1.000 | 0.045 | 0.000 | 0.378 | 0.026 | 0.045 | 0.045 | 0.159 | 0.521 | 1.000 | 1.000 | 1.000 | 1.000 | 0.003 | 0.069 |
| COUNTRY | 0.000 | 0.005 | 0.023 | 0.000 | 0.813 | 0.109 | 0.000 | 0.267 | 0.018 | 0.038 | 0.038 | 0.117 | 0.636 | 1.000 | 1.000 | 1.000 | 1.000 | 0.007 | 0.067 |
| ADMIN1 | 0.246 | 0.359 | 0.200 | 0.325 | 0.763 | 0.878 | 0.003 | 0.459 | 0.104 | 0.431 | 0.376 | 0.194 | 0.316 | 1.000 | 1.000 | 1.000 | 1.000 | 0.340 | 0.159 |
| GEO_PRECISION | 0.218 | 0.369 | 0.330 | 0.370 | 0.236 | 0.334 | 0.036 | 0.175 | 0.097 | 0.383 | 0.426 | 0.479 | 0.516 | 0.007 | 0.003 | 0.007 | 0.340 | 1.000 | 0.355 |
| SOURCE_SCALE | 0.215 | 0.206 | 0.196 | 0.173 | 0.141 | 0.184 | 0.023 | 0.127 | 0.100 | 0.236 | 0.221 | 0.133 | 0.227 | 0.067 | 0.069 | 0.067 | 0.159 | 0.355 | 1.000 |
First rows
| | EVENT_ID_CNTY | EVENT_DATE | YEAR | TIME_PRECISION | DISORDER_TYPE | EVENT_TYPE | SUB_EVENT_TYPE | ACTOR1 | ASSOC_ACTOR_1 | INTER1 | ACTOR2 | ASSOC_ACTOR_2 | INTER2 | INTERACTION | CIVILIAN_TARGETING | ISO | REGION | COUNTRY | ADMIN1 | ADMIN2 | ADMIN3 | LOCATION | LATITUDE | LONGITUDE | GEO_PRECISION | SOURCE | SOURCE_SCALE | NOTES | FATALITIES | TAGS | TIMESTAMP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ROU448 | 20-May-2019 | 2019 | 1 | Political violence | Violence against civilians | Attack | Police Forces of Romania (2016-2019) Coast Guard | NaN | 1 | Civilians (Turkey) | Fishers (Turkey) | 7 | 17 | Civilian targeting | 642 | Europe | Romania | Constanta | NaN | NaN | Coast of Constanta | 44.156 | 28.948 | 2 | Deschide; Hurriyet Daily; News.ro; CNN; TRT Haber | National-International | On 20 May 2019, the Coast Guard of Romania fired at a Turkish fishing boat allegedly fishing in Romania's Exclusive Economic Zone (EEZ), 52 sea miles east of Costinesti [coded to Coast of Constanta]. Three Turkish fishers were reported to wounded. | 0 | NaN | 1649875498 |
| 1 | ROU1885 | 28-March-2022 | 2022 | 1 | Strategic developments | Strategic developments | Disrupted weapons use | Military Forces of Romania (2021-) | NaN | 1 | Unidentified Military Forces | NaN | 8 | 18 | NaN | 642 | Europe | Romania | Constanta | NaN | NaN | Coast of Constanta | 44.156 | 28.948 | 1 | Adevarul; G4media | National | Defusal: On 28 March 2022, Romanian minesweepers conducted a controlled detonation of a sea mine 70km away from the Midia port off the Coast of Constanta. The mine was most likely strayed from a location near Ukraine. | 0 | NaN | 1649184809 |
| 2 | ROU1940 | 28-July-2022 | 2022 | 1 | Demonstrations | Protests | Peaceful protest | Protesters (Romania) | Greenpeace | 6 | NaN | NaN | 0 | 60 | NaN | 642 | Europe | Romania | Constanta | NaN | NaN | Coast of Constanta | 44.156 | 28.948 | 1 | News.ro | National | On 28 July 2022, Greenpeace activists protested in front of an oil tanker off the Coast of Constanta, demanding a stop to the dependence on fossil fuels due to their effect on the climate. | 0 | crowd size=no report | 1659462993 |
| 3 | ROU1945 | 31-July-2022 | 2022 | 1 | Strategic developments | Strategic developments | Disrupted weapons use | Military Forces of Romania (2021-) | NaN | 1 | Unidentified Armed Group (International) | NaN | 3 | 13 | NaN | 642 | Europe | Romania | Constanta | NaN | NaN | Coast of Constanta | 44.156 | 28.948 | 1 | Digi24 | National | Defusal: On 31 July 2022, Romanian Naval Forces defused and destroyed a YAM sea mine off the Coast of Constanta. The mine likely came from the Northern Black Sea region from the Ukrainian coast. | 0 | NaN | 1660055880 |
| 4 | ROU1947 | 04-August-2022 | 2022 | 1 | Demonstrations | Protests | Peaceful protest | Protesters (Romania) | Greenpeace | 6 | NaN | NaN | 0 | 60 | NaN | 642 | Europe | Romania | Constanta | NaN | NaN | Coast of Constanta | 44.156 | 28.948 | 1 | News.ro | National | On 4 August 2022, Greenpeace activists protested in front of an oil tanker off the Coast of Constanta, demanding a stop to the dependence on fossil fuels due to their effect on the climate. | 0 | crowd size=no report | 1660055882 |
| 5 | ROU1961 | 08-September-2022 | 2022 | 1 | Political violence | Explosions/Remote violence | Remote explosive/landmine/IED | Unidentified Military Forces | NaN | 8 | Military Forces of Romania (2021-) | NaN | 1 | 18 | NaN | 642 | Europe | Romania | Constanta | NaN | NaN | Coast of Constanta | 44.156 | 28.948 | 1 | G4media; RFE/RL; Adevarul; Balkan Insight; Digi24 | National-International | On 8 September 2022, a sea mine of unknown origin hit the DM-29 minesweeping ship of the Romanian navy 46km off the Coast of Constanta, damaging it during a demining operation. No casualties. | 0 | NaN | 1663096246 |
| 6 | ROU2026 | 10-December-2022 | 2022 | 1 | Strategic developments | Strategic developments | Disrupted weapons use | Military Forces of Romania (2021-) | NaN | 1 | Unidentified Military Forces | NaN | 8 | 18 | NaN | 642 | Europe | Romania | Constanta | NaN | NaN | Coast of Constanta | 44.156 | 28.948 | 1 | Agerpres; Deschide | National-International | Defusal: On 10 December 2022, the Romanian navy conducted a controlled detonation of a sea mine of unknown origin located off the Coast of Constanta. | 0 | NaN | 1673376402 |
| 7 | ROU2045 | 16-January-2023 | 2023 | 1 | Strategic developments | Strategic developments | Disrupted weapons use | Military Forces of Romania (2021-) | NaN | 1 | Military Forces of Russia (2000-) | NaN | 8 | 18 | NaN | 642 | Europe | Romania | Constanta | Mihai Viteazu | NaN | Canalul Periboina | 44.611 | 28.929 | 2 | Adevarul; News.ro | National | Defusal: On 16 January 2023, Romanian Naval Forces captured and disabled a part of a 57E6 Russian missile that was washed away by the sea near the Canalul Periboina. | 0 | NaN | 1674572555 |
| 8 | TUR14260 | 16-November-2020 | 2020 | 2 | Strategic developments | Strategic developments | Arrests | Military Forces of Turkey (2016-) | NaN | 1 | Civilians (Afghanistan) | Refugees/IDPs (Afghanistan); Civilians (Syria); Refugees/IDPs (Syria); Civilians (Iran); Refugees/IDPs (Iran) | 7 | 17 | NaN | 792 | Middle East | Turkey | Sinop | Sinop | NaN | Coast of Sinop | 42.039 | 35.224 | 2 | Ihlas News Agency | National | Arrests: Around 16 November 2020 (as reported), 115 migrants attempting to cross the border illegally to Romania via boats were stopped off the coast of Sinop and detained by the Coast Guard Command of Turkish Army. The migrants were from Afghanistan, Syria and Iran. | 0 | NaN | 1606149144 |
| 9 | TUR18570 | 03-November-2021 | 2021 | 1 | Strategic developments | Strategic developments | Arrests | Military Forces of Turkey (2016-) | NaN | 1 | Civilians (International) | Refugees/IDPs (International) | 7 | 17 | NaN | 792 | Middle East | Turkey | Istanbul | Sile | NaN | Coast of Sile | 41.253 | 29.713 | 2 | Ihlas News Agency; A Haber | National | On 3 November 2021, 40 migrants attempting to cross the border illegally on a boat were captured off the coast of Sile, Istanbul and detained by the Coast Guard Command of the Turkish army. The migrants were from Afghanistan, Iraq, Iran, Syria and Bangladesh. | 0 | NaN | 1636383675 |
Last rows
| | EVENT_ID_CNTY | EVENT_DATE | YEAR | TIME_PRECISION | DISORDER_TYPE | EVENT_TYPE | SUB_EVENT_TYPE | ACTOR1 | ASSOC_ACTOR_1 | INTER1 | ACTOR2 | ASSOC_ACTOR_2 | INTER2 | INTERACTION | CIVILIAN_TARGETING | ISO | REGION | COUNTRY | ADMIN1 | ADMIN2 | ADMIN3 | LOCATION | LATITUDE | LONGITUDE | GEO_PRECISION | SOURCE | SOURCE_SCALE | NOTES | FATALITIES | TAGS | TIMESTAMP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 96072 | UKR96252 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Russia (2000-) | NaN | 8 | NaN | NaN | 0 | 80 | NaN | 804 | Europe | Ukraine | Kharkiv | Kupianskyi | Dvorichanska | Zapadne | 49.822 | 37.614 | 2 | Ministry of Defence of Ukraine | Other | On 17 March 2023, Russian forces shelled near Zapadne, Kharkiv. Casualties unknown. | 0 | NaN | 1679425924 |
| 96073 | UKR96253 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Russia (2000-) | NaN | 8 | NaN | NaN | 0 | 80 | NaN | 804 | Europe | Ukraine | Donetsk | Volnovaskyi | Velykonovosilkivska | Zolota Nyva | 47.794 | 36.990 | 2 | Ministry of Defence of Ukraine | Other | On 17 March 2023, Russian forces shelled near Zolota Nyva, Donetsk. Casualties unknown. | 0 | NaN | 1679425924 |
| 96074 | UKR96263 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Air/drone strike | Military Forces of Russia (2000-) Air Force | NaN | 8 | NaN | NaN | 0 | 80 | NaN | 804 | Europe | Ukraine | Dnipropetrovsk | Novomoskovskyi | Novomoskovska | Novomoskovsk | 48.638 | 35.246 | 2 | Suspilne Media | National | On 17 March 2023, Russian Shahed drones struck an infrastructure facility in Novomoskovsk, Dnipropetrovsk. Casualties unknown. | 0 | NaN | 1679425924 |
| 96075 | UKR96342 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Ukraine (2019-) | NaN | 1 | Military Forces of Russia (2000-) Donetsk People's Militia | NaN | 8 | 18 | NaN | 804 | Europe | Ukraine | Donetsk | Donetskyi | Donetska | Donetsk - Kirovskyi | 47.968 | 37.548 | 1 | DPR Armed Forces Press Service | Other | On 17 March 2023, Ukrainian forces shelled DPR in Donetsk - Kirovskyi, Donetsk. Casualties unknown. | 0 | NaN | 1679425924 |
| 96076 | UKR96343 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Ukraine (2019-) | NaN | 1 | Military Forces of Russia (2000-) Donetsk People's Militia | NaN | 8 | 18 | NaN | 804 | Europe | Ukraine | Donetsk | Donetskyi | Donetska | Donetsk - Kuibyshivskyi | 48.023 | 37.728 | 1 | DPR Armed Forces Press Service | Other | On 17 March 2023, Ukrainian forces shelled DPR in Donetsk - Kuibyshivskyi, Donetsk. Casualties unknown. | 0 | NaN | 1679425924 |
| 96077 | UKR96344 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Ukraine (2019-) | NaN | 1 | Military Forces of Russia (2000-) Donetsk People's Militia | NaN | 8 | 18 | NaN | 804 | Europe | Ukraine | Donetsk | Donetskyi | Donetska | Donetsk - Kyivskyi | 47.986 | 37.862 | 1 | DPR Armed Forces Press Service | Other | On 17 March 2023, Ukrainian forces shelled DPR in Donetsk - Kyivskyi, Donetsk. Casualties unknown. | 0 | NaN | 1679425924 |
| 96078 | UKR96345 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Ukraine (2019-) | NaN | 1 | Military Forces of Russia (2000-) Donetsk People's Militia | Civilians (Ukraine) | 8 | 18 | NaN | 804 | Europe | Ukraine | Donetsk | Donetskyi | Donetska | Donetsk - Petrovskyi | 47.950 | 37.614 | 1 | DPR Armed Forces Press Service | Other | On 17 March 2023, Ukrainian forces shelled DPR in Donetsk - Petrovskyi, Donetsk. Two civilians were killed. | 2 | NaN | 1679425924 |
| 96079 | UKR96346 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Ukraine (2019-) | NaN | 1 | Military Forces of Russia (2000-) Donetsk People's Militia | NaN | 8 | 18 | NaN | 804 | Europe | Ukraine | Donetsk | Horlivskyi | Horlivska | Horlivka | 48.313 | 38.042 | 1 | DPR Armed Forces Press Service | Other | On 17 March 2023, Ukrainian forces shelled DPR in Horlivka, Donetsk. Casualties unknown. | 0 | NaN | 1679425924 |
| 96080 | UKR96347 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Shelling/artillery/missile attack | Military Forces of Ukraine (2019-) | NaN | 1 | Military Forces of Russia (2000-) Donetsk People's Militia | NaN | 8 | 18 | NaN | 804 | Europe | Ukraine | Donetsk | Donetskyi | Yasynuvatska | Yasynuvata | 48.130 | 37.859 | 1 | DPR Armed Forces Press Service | Other | On 17 March 2023, Ukrainian forces shelled DPR in Yasynuvata, Donetsk. Casualties unknown. | 0 | NaN | 1679425924 |
| 96081 | UKR96375 | 17-March-2023 | 2023 | 1 | Political violence | Explosions/Remote violence | Remote explosive/landmine/IED | Yuvileine Communal Militia (Ukraine) | NaN | 4 | Police Forces of Ukraine (2019-) | Military Forces of Russia (2000-); Civilians (Ukraine) | 1 | 14 | NaN | 804 | Europe | Ukraine | Kherson | Khersonskyi | Yuvileina | Yuvileine | 46.486 | 33.211 | 1 | BBC News | Regional | On 17 March 2023, suspected partisans blew up a car of a police officer in Yuvileine, Kherson. The victim had been collaborating with Russian forces and allegedly tortured Ukrainian civilians in Nova Kakhovka. The police officer was killed, another woman was injured. | 1 | NaN | 1679425924 |